Useful Functions for Crawling

Functions developed for a crawling project at EVS. These functions are now being used by many people here at EVS. Here you will find these functions along with a little documentation. Feel free to use and modify them.

1) getdataA(startposition,endposition,string)

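function getdataA($strStart, $strEnd, $text) {
    $tmpstr = '';  // stays empty when no match is found
    $len = strlen($text);
    for ($i = 0; $i < $len; $i++) {
        // Look for the start marker at position $i.
        if (substr($text, $i, strlen($strStart)) === $strStart) {
            $k = $i;
            // Advance $k to the next end marker, stopping at the end of the text.
            while ($k < $len && substr($text, $k, strlen($strEnd)) !== $strEnd) {
                $k++;
            }
            if ($k >= $len) {
                break;  // no end marker left, so no later match is possible
            }
            // Keep only the text between the two markers.
            $start = $i + strlen($strStart);
            $tmpstr = substr($text, $start, $k - $start);
        }
    }
    return $tmpstr;  // the last match, because the loop keeps scanning
}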

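Explanation of the function:-

Suppose the variable $tempdata contains a string of data enclosed between a starting tag and an ending tag.

How to use:-

To use getdataA, call it as

$data = getdataA($strStart, $strEnd, $tempdata);

where $strStart is the starting tag and $strEnd is the ending tag. The function returns a string containing the data in $tempdata that lies between the two tags, excluding the tags themselves. If the tags occur more than once, the last occurrence is returned, because each new match overwrites the previous one.

That is, the starting and ending portions are removed from $tempdata.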
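As an illustration, here is a minimal sketch of a getdataA call. The sample string and the <b> tags are assumptions made up for this example, not the original data. The function body is a copy of getdataA with hedged safety additions marked in the comments: the scan stops at the end of the text, and an empty string comes back when nothing matches.

```php
<?php
// Copy of getdataA with added safety checks (see comments marked "added").
function getdataA($strStart, $strEnd, $text) {
    $tmpstr = '';                       // added: default when nothing matches
    $len = strlen($text);
    for ($i = 0; $i < $len; $i++) {
        if (substr($text, $i, strlen($strStart)) === $strStart) {
            $k = $i;
            // added: "$k < $len" keeps the scan inside the text
            while ($k < $len && substr($text, $k, strlen($strEnd)) !== $strEnd) {
                $k++;
            }
            if ($k >= $len) {
                break;                  // added: no end marker was found
            }
            $start = $i + strlen($strStart);
            $tmpstr = substr($text, $start, $k - $start);
        }
    }
    return $tmpstr;
}

// Hypothetical sample data; the <b> tags are an assumption.
$tempdata = 'Company: <b>Evision</b>';
$data = getdataA('<b>', '</b>', $tempdata);
echo $data, "\n";   // prints: Evision
```

With this sample, only the text between the two tags comes back, and the tags themselves are dropped.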

2) getdataB(startposition,endposition,string)

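function getdataB($strStart, $strEnd, $text) {
    $tmpstr = '';  // stays empty when no match is found
    $len = strlen($text);
    for ($i = 0; $i < $len; $i++) {
        // Look for the start marker at position $i.
        if (substr($text, $i, strlen($strStart)) === $strStart) {
            $k = $i;
            // Advance $k to the next end marker, stopping at the end of the text.
            while ($k < $len && substr($text, $k, strlen($strEnd)) !== $strEnd) {
                $k++;
            }
            if ($k >= $len) {
                break;  // no end marker left, so no later match is possible
            }
            // Keep both markers: from the start marker through the end marker.
            $k = $k + strlen($strEnd);
            $tmpstr = substr($text, $i, $k - $i);
        }
    }
    return $tmpstr;  // the last match, because the loop keeps scanning
}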

Explanation of the function:-

Suppose the variable $tempdata contains a string of data enclosed between a starting tag and an ending tag.

How to use:-

To use getdataB, call it as

$data = getdataB($strStart, $strEnd, $tempdata);

where $strStart is the starting tag and $strEnd is the ending tag. The function returns a string containing the data in $tempdata together with its tags. So by using getdataB we get all the data, including the starting and ending points mentioned in the function call.
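A minimal sketch of a getdataB call follows. As before, the sample string and the <b> tags are assumptions for illustration only, and the function body is a copy of getdataB with added bounds checks marked in the comments.

```php
<?php
// Copy of getdataB with added safety checks (see comments marked "added").
function getdataB($strStart, $strEnd, $text) {
    $tmpstr = '';                       // added: default when nothing matches
    $len = strlen($text);
    for ($i = 0; $i < $len; $i++) {
        if (substr($text, $i, strlen($strStart)) === $strStart) {
            $k = $i;
            // added: "$k < $len" keeps the scan inside the text
            while ($k < $len && substr($text, $k, strlen($strEnd)) !== $strEnd) {
                $k++;
            }
            if ($k >= $len) {
                break;                  // added: no end marker was found
            }
            $k = $k + strlen($strEnd);  // step past the end marker to keep it
            $tmpstr = substr($text, $i, $k - $i);
        }
    }
    return $tmpstr;
}

// Hypothetical sample data; the <b> tags are an assumption.
$tempdata = 'Company: <b>Evision</b>';
$data = getdataB('<b>', '</b>', $tempdata);
echo $data, "\n";   // prints: <b>Evision</b>
```

Here the result keeps both tags, which is handy when the matched fragment will be parsed again later.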

Main difference between getdataA and getdataB

From the examples above we can easily see the difference between getdataA and getdataB: getdataA returns only the string between the specified starting and ending strings or tags, while getdataB returns the whole string including the starting and ending tags.

3) getdataC(startposition,endposition,string)

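function getdataC($strStart, $strEnd, $text) {
    $tmpstr = '';  // stays empty when no match is found
    $len = strlen($text);
    for ($i = 0; $i < $len; $i++) {
        // Look for the start marker at position $i.
        if (substr($text, $i, strlen($strStart)) === $strStart) {
            $k = $i;
            // Advance $k to the next end marker, stopping at the end of the text.
            while ($k < $len && substr($text, $k, strlen($strEnd)) !== $strEnd) {
                $k++;
            }
            if ($k >= $len) {
                break;  // no end marker left, so there is nothing to return
            }
            // Keep only the text between the two markers.
            $start = $i + strlen($strStart);
            $tmpstr = substr($text, $start, $k - $start);
            break;  // stop at the first match
        }
    }
    return $tmpstr;
}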

Explanation of the function:-

This function returns the data that occurs first. Suppose the variable $tempdata contains a string of data in which the starting and ending tags occur more than once.

How to use:-

To use getdataC, call it as

$data = getdataC($strStart, $strEnd, $tempdata);

This function returns a string containing the following data from $tempdata:

Evision

So by using getdataC we get the first occurrence of the data between the starting and ending points, excluding the points themselves.
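To make the first-match behaviour concrete, here is a minimal sketch. The sample string reuses the values mentioned in this post, but the <b> tags around them are an assumption. The function body is a copy of getdataC with added bounds checks marked in the comments.

```php
<?php
// Copy of getdataC with added safety checks (see comments marked "added").
function getdataC($strStart, $strEnd, $text) {
    $tmpstr = '';                       // added: default when nothing matches
    $len = strlen($text);
    for ($i = 0; $i < $len; $i++) {
        if (substr($text, $i, strlen($strStart)) === $strStart) {
            $k = $i;
            // added: "$k < $len" keeps the scan inside the text
            while ($k < $len && substr($text, $k, strlen($strEnd)) !== $strEnd) {
                $k++;
            }
            if ($k >= $len) {
                break;                  // added: no end marker was found
            }
            $start = $i + strlen($strStart);
            $tmpstr = substr($text, $start, $k - $start);
            break;                      // first match only
        }
    }
    return $tmpstr;
}

// Hypothetical sample data; the <b> tags are an assumption.
$tempdata = '<b>Evision</b> <b>Software</b> <b>Islamabad</b>';
$data = getdataC('<b>', '</b>', $tempdata);
echo $data, "\n";   // prints: Evision
```

Because of the break, scanning stops as soon as one match is found, so later occurrences are never examined.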

4) reducere(string)

function reducere($text)
{
    // Characters to strip: carriage return, tab, and newline.
    $a = array("\r", "\t", "\n");
    $r = str_replace($a, '', $text);
    return $r;
}

Explanation of the function:-

This function removes carriage returns, newlines, and tabs from the text. Ordinary spaces are left in place.
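A short sketch of the effect, using a made-up input string (an assumption for illustration). Note that words separated only by a line break end up joined, and runs of ordinary spaces are untouched.

```php
<?php
// Same function as in this post.
function reducere($text)
{
    $a = array("\r", "\t", "\n");
    $r = str_replace($a, '', $text);
    return $r;
}

// Hypothetical input mixing a Windows line break, a tab, and two spaces.
$raw = "Evision\r\n\tSoftware  Islamabad";
$clean = reducere($raw);
echo $clean, "\n";   // prints: EvisionSoftware  Islamabad
```

If the line breaks should become spaces instead of disappearing, passing ' ' as the replacement argument of str_replace is one possible variation.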

5) getdataArray(startposition,endposition,string)

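function getdataArray($strStart, $strEnd, $text) {
    $tmpstr = array();  // stays empty when no match is found
    $len = strlen($text);
    for ($i = 0; $i < $len; $i++) {
        // Look for the start marker at position $i.
        if (substr($text, $i, strlen($strStart)) === $strStart) {
            $k = $i;
            // Advance $k to the next end marker, stopping at the end of the text.
            while ($k < $len && substr($text, $k, strlen($strEnd)) !== $strEnd) {
                $k++;
            }
            if ($k >= $len) {
                break;  // no end marker left, so no later match is possible
            }
            // Keep both markers and collect every match.
            $k = $k + strlen($strEnd);
            $tmpstr[] = substr($text, $i, $k - $i);
        }
    }
    return $tmpstr;
}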

Explanation of the function:-

This function returns an array of data. Suppose the variable $tempdata contains a string of data in which the starting and ending tags occur several times.

How to use:-

To use getdataArray, call it as

$data = getdataArray($strStart, $strEnd, $tempdata);

This function returns an array which includes data such as

$data[0] = Evision

$data[1] = Software

$data[2] = Islamabad

So by using getdataArray we get an array with one element per match. Note that, as the code is written, each element keeps its starting and ending tag (as in getdataB), so the tags must be stripped separately if only the values are needed.
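The sketch below shows what the array looks like for a made-up input. The <b> tags are an assumption, and the function body is a copy of getdataArray with added bounds checks marked in the comments. The last step, which uses PHP's built-in strip_tags to drop the tags, is an optional addition and not part of the original functions.

```php
<?php
// Copy of getdataArray with added safety checks (see comments marked "added").
function getdataArray($strStart, $strEnd, $text) {
    $tmpstr = array();                  // added: default when nothing matches
    $len = strlen($text);
    for ($i = 0; $i < $len; $i++) {
        if (substr($text, $i, strlen($strStart)) === $strStart) {
            $k = $i;
            // added: "$k < $len" keeps the scan inside the text
            while ($k < $len && substr($text, $k, strlen($strEnd)) !== $strEnd) {
                $k++;
            }
            if ($k >= $len) {
                break;                  // added: no end marker was found
            }
            $k = $k + strlen($strEnd);  // step past the end marker to keep it
            $tmpstr[] = substr($text, $i, $k - $i);
        }
    }
    return $tmpstr;
}

// Hypothetical sample data; the <b> tags are an assumption.
$tempdata = '<b>Evision</b><b>Software</b><b>Islamabad</b>';
$data = getdataArray('<b>', '</b>', $tempdata);
// $data[0] is '<b>Evision</b>', $data[1] is '<b>Software</b>',
// $data[2] is '<b>Islamabad</b>'

// Optional: remove the tags from every element with strip_tags.
$plain = array_map('strip_tags', $data);
print_r($plain);    // Evision, Software, Islamabad
```

strip_tags only suits HTML-like markers; for other kinds of start and end strings, the between-the-markers logic of getdataA would need to be applied to each element instead.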
