Finding text within lines in a file with PHP


For a weekly status report, I need to determine the number of work requests that are approved and awaiting implementation. The list of requests in that state is on a webpage that also contains other information, including requests in other states, such as those awaiting approval. I normally download that webpage so I can run scripts against it to extract other information, so I decided to create a PHP script that would display just the list of requests awaiting implementation and produce a count of them. On the source webpage, the line that marks the start of the section listing the requests that are approved and awaiting implementation contains the text "Requests Waiting Implementation". The HTML code that marks the end of that section contains an ending div tag. So I created the two PHP variables below to hold the two strings I need to search for within the file.

$startString = "Requests Waiting Implementation";
$endString = "</div>";

Since I want to process the HTML file I've downloaded to obtain the data, I need to open it and read it line by line. To do so, I can use the following PHP code:

$requestsFile = "data/Requests.html";

$file_handle = fopen($requestsFile, "r");

while (!feof($file_handle)) {
   $line = fgets($file_handle);
   // other processing of $line goes here
}
fclose($file_handle);

The variable $requestsFile stores the location of the file I want to process. The next line stores a reference to that file in the variable $file_handle. The fopen function creates the connection to the file. Here I pass it two parameters: the file name and the type of access desired. Because I just want to read the contents of the file and not write to it, I can use "r", since it is a text file. If it were a binary file, I would use "rb". I can then read the file line by line until the end-of-file (EOF) is encountered, using a while loop that checks for the end of the file with feof($file_handle). The feof function returns TRUE if the file pointer is at the end of the file or an error occurs; otherwise it returns FALSE. Since I want to continue while not at the end of the file, I can use !feof($file_handle) as the condition for the while loop. The fgets function reads a line from the file whose handle is passed to it as an argument. In this case, I assign the string returned by the function to the variable $line. After all lines in the file are processed, i.e., the EOF is reached for the HTML file containing the input to be processed by the PHP script, I can close the file with the fclose function.
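
As a variation on the loop above, fgets itself returns FALSE when there is nothing left to read, so its return value can serve as the loop condition. With a feof-based loop, the body may run one extra time with $line set to FALSE after the last real line has been read. The following is a minimal sketch, assuming a small temporary sample file; the names $sampleFile, $handle and $lines are mine and are not part of the script in this article.

```php
<?php
// Create a small sample file so the sketch is self-contained (made-up content).
$sampleFile = tempnam(sys_get_temp_dir(), "req");
file_put_contents($sampleFile, "first line\nsecond line\nthird line\n");

$lines = [];
$handle = fopen($sampleFile, "r");
if ($handle === false) {
    exit("Unable to open $sampleFile\n");
}

// fgets returns false once no more data can be read, which ends the loop.
while (($line = fgets($handle)) !== false) {
    // Strip the trailing newline that fgets keeps on each line.
    $lines[] = rtrim($line, "\r\n");
}
fclose($handle);
unlink($sampleFile);

echo count($lines), " lines read\n";
```

Either loop style works for this task; the fgets-based condition just avoids having to handle a FALSE value for $line inside the loop body.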

To determine if a string, e.g., a word or phrase (in this case "Requests Waiting Implementation"), is present in a line, I can use the strpos function, e.g., strpos($line, $startString). The function returns the numeric position of the first occurrence of the string for which I want it to search, i.e., $startString, in the variable $line. If the substring doesn't occur in the string contained in $line, the function returns the Boolean value FALSE. So in this case, I want to set the variable $found_startLine to TRUE if the function doesn't return FALSE, as shown below. I don't check for the value TRUE, because if the search string is present a numeric value is returned, and that value is 0 when the match is at the very start of the line. A loose comparison would treat 0 as FALSE, which is why the check uses the strict !== operator.

if (strpos($line, $startString) !== false) {
   $found_startLine = True;
}
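
To illustrate why the strict comparison matters, here is a small sketch using made-up sample lines (they are not taken from the real requests page). It shows the three kinds of values strpos can return and how a loose truth test misreads a match at position 0.

```php
<?php
$startString = "Requests Waiting Implementation";

// Hypothetical sample lines for illustration only.
$atStart  = "Requests Waiting Implementation</h2>";
$inMiddle = "<h2>Requests Waiting Implementation</h2>";
$absent   = "<h2>Requests Waiting Approval</h2>";

var_dump(strpos($atStart, $startString));   // int(0): match at the first character
var_dump(strpos($inMiddle, $startString));  // int(4): match after "<h2>"
var_dump(strpos($absent, $startString));    // bool(false): no match

// A loose truth test treats position 0 as "not found".
$looseMatch = strpos($atStart, $startString) ? true : false;

// The strict comparison distinguishes 0 from false.
$strictMatch = strpos($atStart, $startString) !== false;

var_dump($looseMatch);   // bool(false)
var_dump($strictMatch);  // bool(true)
```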

Once I've found the starting text I'm looking for in the file, I'll set that variable to TRUE. If I've already found the starting line and I've reached the line in the file containing the ending string, I'll break out of the loop using the break statement. After the script has found the line with the starting text, but before it has found the line with the ending text, I want it to echo, i.e., display, every line it reads to its output, in this case to a webpage it is generating. Each line with a request number on it in the input file has a link, i.e., a URL, in the line, while no other lines do. So I count every line between the starting and ending lines that contains the text <a href= which gives me the number of requests. The entire PHP code is as follows:

<?php

$requestsFile = "data/Requests.html";
$startString = "Requests Waiting Implementation";
$endString = "</div>";
$found_startLine = false;
$count = 0;

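$file_handle = fopen($requestsFile, "r");
if ($file_handle === false) {
   // Stop here; otherwise feof() below would be called on an invalid handle.
   exit("Unable to open $requestsFile");
}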
while (!feof($file_handle)) {
   $line = fgets($file_handle);
   if (!$found_startLine) {
      if (strpos($line, $startString) !== false) {
         $found_startLine = True;
      }
   }
   else {
      if (strpos($line, $endString) !== false) {
         break;
      }
      else {
         echo $line;
         if (strpos($line, "<a href=") !== false) {
            $count = $count + 1;
         }
      }
   }
}
fclose($file_handle);
echo "<p>";
echo "Number of requests awaiting implementation: ", $count;
echo "</p>\n";

?>

The code for the .php file that produces the page, which I run on my MacBook Pro laptop, is in the file weekly_status.php.
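
The same start-marker and end-marker logic can also be wrapped in a function so it can be tried against sample data without the downloaded page. This is only a sketch: the function name extractSection and the sample lines below are hypothetical and are not part of weekly_status.php.

```php
<?php
/**
 * Collect the lines between a start marker and an end marker and count
 * those that contain a link. Hypothetical helper for illustration.
 */
function extractSection(array $lines, string $startString, string $endString): array
{
    $foundStartLine = false;
    $section = [];
    $count = 0;

    foreach ($lines as $line) {
        if (!$foundStartLine) {
            // Still looking for the line that opens the section.
            if (strpos($line, $startString) !== false) {
                $foundStartLine = true;
            }
            continue;
        }
        if (strpos($line, $endString) !== false) {
            break;  // End of the section; stop reading.
        }
        $section[] = $line;
        if (strpos($line, "<a href=") !== false) {
            $count++;
        }
    }
    return ["lines" => $section, "count" => $count];
}

// Made-up sample input with two sections and a trailing line.
$sample = [
    '<div><h2>Requests Waiting Approval</h2>',
    '<p><a href="r/101">101</a></p>',
    '</div>',
    '<div><h2>Requests Waiting Implementation</h2>',
    '<p><a href="r/102">102</a></p>',
    '<p>No link on this line</p>',
    '<p><a href="r/103">103</a></p>',
    '</div>',
    '<p><a href="r/104">104</a></p>',
];

$result = extractSection($sample, "Requests Waiting Implementation", "</div>");
echo "Number of requests awaiting implementation: ", $result["count"], "\n";
```

For a real file, the array of lines could come from PHP's file() function, e.g., extractSection(file("data/Requests.html"), $startString, $endString), at the cost of reading the whole file into memory at once.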

Related articles:

  1. Running an Apache web server under OS X El Capitan
  2. PHP for Apache on OS X El Capitan
