#! /bin/sed -f #---INDEXER.SED v1.2, by Eric Pement <epement@jpusa.pr.mcs.net>------ # Sed script to alter files with lines with this input format: # Christ, as "firstborn"; 22 # Christ, as "firstborn"; 155 # Christ, as "firstborn"; 194 # into one which replaces the semicolon with a comma, combining the # page numbers into one line, like so: # Christ, as "firstborn", 22, 155, 194 # It is essential that the input file be sorted prior to running this # script, and each line of the input file contain only 1 semicolon. # GNU sort syntax: # sort -t";" +0f -1 +1n input.file > input.sort # # SYNTAX: sed -f INDEXER.SED input.sort > output.file # # The following command causes abort at lines missing a semicolon: /;/ !{ i\ ******************************* \ ERROR - Each line of the input \ file MUST have a semicolon! \ ******************************* \ ^G \ Offending line occurs at this line number: = q } # Following command causes abort at lines with 2 semicolons: /;.*;/ { i\ ******************************* \ ERROR - There may be only ONE \ semicolon on each line! \ ******************************* \ ^G \ Offending line occurs at this line number: = q } # Main body of sed script follows: { :loop $ !N s/^\([^;]*;\) \(.*\)\n\1 \(.*\)/\1 \2, \3/ t loop s/;/,/ P D } #------------------END of SCRIPT---------------------------- ### colorized by sedsed, a debugger and code formatter for sed scripts ### original script: http://sed.sf.net/grabbag/scripts/indexer.sed