Program 'Modify TXT-files'

  1. Definition

    The task of the Perl program ModifyingTextFile.pl is to transform a text file, the input file, into another file, the output file. The transformation replaces all occurrences of a given string with another string. The command line of ModifyingTextFile.pl uses the following syntax:

    perl ModifyingTextFile.pl [-v] [-i] of=old_file_name [nf=new_file_name] os=old_string ns=new_string

    Note: Optional arguments are enclosed in brackets [], which are not part of the syntax, i.e. they must not be typed.

    Argument  Mandatory  Value          Meaning

    [-v]      no
              Stands for verbose.
              • If not set, only a minimum of information is displayed, presumably only what is of interest to the user.
              • If set, the program sends a maximum of information about the treatment performed to the terminal. This may be useful if the program does not behave as expected.

    [-i]      no
              Stands for ignore case.
              By default the pattern to search for is only recognized if the letters as well as their case match. If the argument is set, letter case has no impact on the matching operation.

    of        yes        old_file_name
              Name of the input file to be transformed.

    [nf]      no         new_file_name
              Name of the output file, the result of the transformation. If not set, the program assumes the file name Output_File.txt.
              Beware that the names of the input and output file must be different. If this condition is not fulfilled, the treatment is aborted.

    os        yes        old_string
              The string to search for and replace. It can be any Perl pattern, also known as a Perl regular expression (RE).

    ns        yes        new_string
              The replacement string is interpreted as is.

    Tab.: Command line arguments of ModifyingTextFile.pl
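
    The rules for nf (default name, and the requirement that input and output names differ) can be sketched as follows. This is only an illustration, not the code of ModifyingTextFile.pl: the helper name resolve_output_name and the wording of the error message are assumptions.

```perl
use strict;
use warnings;

# Hypothetical helper (not taken from ModifyingTextFile.pl):
# returns the output file name to use, or dies if it equals the input name.
sub resolve_output_name {
    my ($old_file, $new_file) = @_;

    # Fall back to the documented default when nf= was not given.
    $new_file = 'Output_File.txt'
        unless defined($new_file) && length($new_file);

    # The program requires two different names; abort otherwise.
    die "Input and output file must differ: '$old_file'\n"
        if $old_file eq $new_file;

    return $new_file;
}

print resolve_output_name('in.txt', undef), "\n";      # default name is used
print resolve_output_name('in.txt', 'out.txt'), "\n";  # explicit name is kept
```

    Note that a plain string comparison does not detect two different spellings of the same path (for example in.txt and ./in.txt); whether the original program handles that case is not stated.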

    This kind of search-and-replace action is available in every text editor, although editors normally operate on a single file, i.e. old file and new file are identical. In this programming example, however, old file and new file must bear different names (see argument nf above).

    Besides its pedagogic value, a Perl program like ModifyingTextFile.pl is of practical interest, e.g. for system administrators who want to transform a series of files automatically in batch mode rather than processing each file manually in a text editor GUI.

  2. Flow chart


    Fig.: Scheme of ModifyingTextFile.pl

    According to the chart, task Modify_Text_File decomposes into the following steps.

    1. Step [0] Read Command Line
      implicit task, performed at invocation (does not require any Perl statement).

    2. Step [1] Parse Command Line
      extracts the values of the parameters that govern the intended treatment

    3. Step [2] Show Input
      allows the user to check afterwards whether the input matches their expectations and, if not, to get clues.

    4. Step [3] Initialize Treatment
      opens the involved files (input to read, output to write)

    5. Step [4] Perform Treatment
      replaces all occurrences of a given pattern with a string.

    6. Step [5] Close Treatment
      delivers a summary of results (the most important result is of course the output file).
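
    Steps [3] to [5] can be sketched in a few lines of Perl. This is a simplified stand-in, not the original source: the function name modify_text_file, the returned counters and the temporary demo files are assumptions made for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical stand-in for steps [3] to [5]; names are illustrative only.
sub modify_text_file {
    my ($old_file, $new_file, $os, $ns) = @_;

    # Step [3]: open the input for reading and the output for writing.
    open my $in,  '<', $old_file or die "Cannot read '$old_file': $!\n";
    open my $out, '>', $new_file or die "Cannot write '$new_file': $!\n";

    # Step [4]: copy each line, replacing every match of the pattern.
    my ($lines, $changed) = (0, 0);
    while (my $line1 = <$in>) {
        (my $line2 = $line1) =~ s/$os/$ns/g;
        $changed++ if $line2 ne $line1;
        $lines++;
        print {$out} $line2;
    }

    # Step [5]: close both files and return figures for a summary.
    close $in;
    close $out or die "Cannot close '$new_file': $!\n";
    return ($lines, $changed);
}

# Demo on a throw-away input file.
my $dir = tempdir(CLEANUP => 1);
my ($old, $new) = ("$dir/in.txt", "$dir/out.txt");
open my $fh, '>', $old or die "Cannot create '$old': $!\n";
print {$fh} "red apple\ngreen pear\nred plum\n";
close $fh;

my ($lines, $changed) = modify_text_file($old, $new, 'red', 'blue');
print "$lines lines read, $changed lines changed\n";
```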

  3. Coding

    1. Modularity

      Every major step has been encapsulated into a separate procedure:

      1. init_options(),
        parse_CL()

      2. show_Command_Line(),
        show_argument_values(),
        show_syntax()

      3. do_treatment()

      4. wri_summary(),
        wri_duration()

    2. Data Manipulation

      • Command line arguments have been encapsulated in the structure struct_ModifyingTxtFileOptions. The declaration uses the syntax of the struct function from the core module Class::Struct:

        use Class::Struct;

        struct( struct_ModifyingTxtFileOptions => [

        oldfile => '$',
        newfile => '$',
        oldstring => '$',
        newstring => '$',
        verbose => '$',
        ignoreCase => '$',
        find_of => '$',
        find_nf => '$',
        find_os => '$',
        find_ns => '$',
        ] );

        Instantiation takes place in the following code line:
        my $ModifyingTxtFileOptions = struct_ModifyingTxtFileOptions->new();

        To enhance clarity, I chose to prefix the structure name with struct_, just as I could have prefixed all procedure names with proc_. I did not do the latter, because Perl itself marks procedures with an (optional) &.
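
        How such a structure is read and written may be easier to see on a reduced example. The sketch below assumes the Class::Struct module and uses a hypothetical three-field class, struct_DemoOptions, instead of the full options structure.

```perl
use strict;
use warnings;
use Class::Struct;

# Reduced, hypothetical version of the options structure (three fields only).
struct( struct_DemoOptions => [
    oldfile    => '$',
    newfile    => '$',
    ignoreCase => '$',
] );

my $opts = struct_DemoOptions->new();

# Calling an accessor with an argument sets the field ...
$opts->oldfile('in.txt');
$opts->ignoreCase(1);

# ... and calling it without one reads the field back.
print "oldfile:    ", $opts->oldfile, "\n";
print "ignoreCase: ", $opts->ignoreCase, "\n";
print "newfile is ", (defined($opts->newfile) ? "set" : "not set"), "\n";
```

        Each '$' declares a scalar field, and Class::Struct generates one accessor method per field, so no hand-written getters or setters are needed.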

    3. Pattern Matching

      In the program chart I highlighted in purple those actions that rely on pattern matching.

      • parse command line:
        to extract an option from string $OptionsStr (the sequence of all command line arguments), the following statement is necessary:

        $OptionsStr =~ /OptionIdentifier(.+?);(\s+|$)/, where OptionIdentifier may be of=, nf= etc.

        Basically the statement says: the sequence of characters up to the separator (;) is presumably the value to retrieve. The quantifier +? is non-greedy, so the match stops at the first separator that is followed by white space or by the end of the string. Perl stores the retrieved value in the special variable $1.
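
        The extraction can be tried out in isolation. The sketch below wraps the pattern in a hypothetical helper, get_option, and assumes that every value in $OptionsStr is terminated by a semicolon; how the program actually builds $OptionsStr is not shown here.

```perl
use strict;
use warnings;

# Hypothetical helper, not taken from the program sources: returns the value
# that follows $identifier (e.g. 'of=') in $options_str, or undef if absent.
sub get_option {
    my ($options_str, $identifier) = @_;

    # \Q...\E quotes the identifier; (.+?) is the non-greedy capture group.
    if ($options_str =~ /\Q$identifier\E(.+?);(\s+|$)/) {
        return $1;
    }
    return undef;
}

# Assumed layout: every value is terminated by a semicolon.
my $OptionsStr = 'of=in.txt; nf=out.txt; os=colou?r; ns=color;';

for my $id (qw(of= nf= os= ns=)) {
    my $value = get_option($OptionsStr, $id);
    print "$id ", (defined $value ? $value : '(not found)'), "\n";
}
```

        Because the separator must be followed by white space or the end of the string, a semicolon inside a value (as in os=a;b;) does not cut the value short.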

      • search and replace pattern:

        The following statement is necessary: ($line2 = $line1) =~ s/$os/$ns/g;, where the variables have the following meaning:
        $line1 Current line of text in input file
        $line2 Current line of text in output file
        $os Pattern to search and replace
        $ns Replacement string

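        Complete substitution within a single code line: the parenthesized assignment first copies $line1 into $line2, then the substitution is applied to the copy, so $line1 stays unchanged. A good example of elegant style: concision without confusion.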
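
        A small experiment shows how the statement behaves. The wrapper replace_in_line is hypothetical, and handling the ignore-case option by compiling the pattern with the /i modifier is an assumption about how the program might honour -i, not a description of its source.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the one-line substitution.
sub replace_in_line {
    my ($line1, $os, $ns, $ignore_case) = @_;

    # Assumption: -i is honoured by compiling the pattern with /i.
    my $re = $ignore_case ? qr/$os/i : qr/$os/;

    # Copy $line1 into $line2, then substitute in the copy only.
    (my $line2 = $line1) =~ s/$re/$ns/g;
    return $line2;
}

print replace_in_line("Foo and foo\n", 'foo', 'bar', 0);  # case-sensitive
print replace_in_line("Foo and foo\n", 'foo', 'bar', 1);  # ignore case
print replace_in_line("a1 b22\n", '\d+', '$1', 0);        # '$1' stays literal
```

        The last call illustrates why the replacement string is interpreted as is: Perl interpolates the variable $ns once, but does not interpolate its content again, so a replacement such as $1 ends up in the output as the two characters $ and 1.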

  4. Downloads

    Perl sources and one use case are available for download in ZIP_Archive_TxtFileModif from project reference Project_TextFileModif.