Introduction

The phpMySearch search engine is a complete World Wide Web indexing and searching system for a small domain or intranet. It is not meant to replace powerful internet-wide search systems such as Lycos, Infoseek, Webcrawler and AltaVista. Instead, it is meant to cover the search needs of a single company, a campus, or even a particular subsection of a web site.

The search engine uses PHP, MySQL, the cURL library and an Adobe PDF-to-HTML conversion gateway.

 

 

System Requirements
 

 

Installation instructions
  1. Unzip the downloaded archive into the folder where you would like to place the search engine.
  2. Edit the file Src/DataBaseSettings.inc.php and set appropriate values for the following variables:

       $DBName       Name of the MySQL database that the search engine will use
       $DBUser       MySQL username
       $DBPassword   MySQL password
       $DBHost       Host where your MySQL server resides

     NOTE: All required tables will be created automatically.

  3. Change the permissions of all log files in the folder ./log to 666 (for example with chmod), so that the scripts can write to them.
  4. That's all.  :-)

 

 

Administration Interface

To access the administration interface, run the script admin.php.

The default login and password are:

login:                admin

password:             admin

NOTE: It is recommended that you change these defaults.

 

 

Table 1-1: Fields and buttons on the Administration page.

Field or button
    Description

DB Main table name
    Name of the table used for storing and indexing documents. The default name is phpMySearch_Pages.

DB Settings table name
    Name of the table used for storing the search engine settings. The default name is phpMySearch_settings.

DB Spider state table name
    Name of the table used for storing the working state of the spider. The default name is phpMySearch_spider.

Search start URL's:
    A list of URLs from which the crawler starts gathering information. To add a URL to the list, type it in the field below the list and push the ADD button. To remove URLs, check the checkboxes next to the URLs you would like to delete and push the REMOVE button.

Not indexed URL's (Black list)
    A list of URLs that the crawler ignores. To add a URL to the list, type it in the field below the list and push the ADD button. To remove URLs, check the checkboxes next to the URLs you would like to delete and push the REMOVE button.

Document extensions to index
    A list of document extensions that the spider should try to index. To add an extension to the list, type it in the field below the list and push the ADD button. To remove extensions, check the checkboxes next to the extensions you would like to delete and push the REMOVE button.

Search depth
    Tells the spider through how many levels of links it should continue crawling, counted from the start pages:

    0 - do not follow any links
    1 - follow links only from the first page
    2 - follow links from the first page and one level further
    3 - follow links from the first page and two levels further

Reparse all
    If this box is checked, the spider clears the database and crawls everything again; otherwise it parses only pages that have been updated. By default it is unchecked.

Automatic spider start
    Check this box if you do not have access to crontab on *nix systems or to the Task Scheduler on Windows systems. If it is checked, the search script checks, each time a visitor uses it, whether it is time to start the crawler. If the start time has been reached, or the crawler was not started at the specified time, the search script starts the spider script.

    It is recommended that you use the system scheduling utilities to start the spider.

Start time
    Time at which the spider starts

Start spider each (days)
    Interval, in days, after which the spider is restarted

Force crawling
    Click the Start Spider button to start the spider immediately.

Number of links per page
    Number of links displayed on a single page

Max pages block
    Number of pages visible in the pages menu

Search Engine log file name
    Path to the file in which the search engine logs its work

Spider Engine log file name
    Path to the file in which the spider logs its work

Admin Tool log file name
    Path to the file in which all changes made in the admin tool are logged

Templates path
    Path to the template files

Admin Login
    Login name of the administrator

Admin Password
    Administrator's password

Confirm Password
    Confirmation of the administrator's password

Submit
    Click the Submit button to save all changes.
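
The Search depth setting described above can be pictured as a depth-limited, breadth-first traversal of the link graph. The Python sketch below is only an illustration of that idea and not the actual phpMySearch spider code: the function name crawl, the in-memory link map SITE and the page paths are all hypothetical, and no real pages are fetched.

```python
from collections import deque

def crawl(start_urls, links, max_depth):
    """Return pages in visit order, following links at most max_depth levels.

    links is a hypothetical in-memory map {page: [linked pages]} that stands
    in for fetching a page and extracting its links.
    """
    seen = set(start_urls)
    queue = deque((url, 0) for url in start_urls)
    visited = []
    while queue:
        url, depth = queue.popleft()
        visited.append(url)          # the page itself is always indexed
        if depth >= max_depth:
            continue                 # depth limit reached: do not follow its links
        for target in links.get(url, []):
            if target not in seen:   # never queue the same page twice
                seen.add(target)
                queue.append((target, depth + 1))
    return visited

# Hypothetical site structure used only for this example.
SITE = {
    "/": ["/a", "/b"],
    "/a": ["/a/1"],
    "/b": ["/"],
    "/a/1": ["/a/1/x"],
}

print(crawl(["/"], SITE, 0))  # ['/']
print(crawl(["/"], SITE, 1))  # ['/', '/a', '/b']
print(crawl(["/"], SITE, 2))  # ['/', '/a', '/b', '/a/1']
```

With depth 0 only the start page is indexed; each additional level lets the traversal follow links from pages one step further away from the start page.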

 

 

Searching

Searching for a single word is simple: type the word you would like to search for and press Submit.

You can also use Boolean logic to narrow the search. See the table below for the operators allowed.

 

Table 2-1: Boolean search operators.

Operator (symbol)
    Description

AND (+)
    Finds documents containing all of the specified words or phrases. Peanut AND butter finds documents with both the word peanut and the word butter.

OR (&)
    Finds documents containing at least one of the specified words or phrases. Peanut OR butter finds documents containing either peanut or butter. The documents found may contain both items, but do not have to.

AND NOT (+-)
    Excludes documents containing the specified word or phrase. Peanut AND NOT butter finds documents that contain peanut but do not contain butter. NOT must be used with another operator, such as AND. The search engine does not accept 'peanut NOT butter'; instead, specify peanut AND NOT butter.

OR NOT (&-)
    Finds documents that contain the first word or phrase, or that do not contain the second. Peanut OR NOT butter finds documents that contain peanut or that do not contain butter.

" "
    Quotation marks denote exact phrases. For example, a search on "New York Times" will match only documents containing the words as an exact phrase. It will not find pages with the words used in a different order, such as "New times in York!"

{ }
    Braces denote folders. For example, a search on {CPAN/objects} will match only documents stored in www.servername.com/currentlocation/CPAN/objects
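
To make the operators in Table 2-1 concrete, here is a minimal Python sketch that evaluates the four word operators and their symbols against the words of a single document. It is an illustration only, not the query parser used by phpMySearch: the function names tokenize and matches are hypothetical, operators are assumed to be written in upper case and applied strictly from left to right, and quoted phrases and braces are left out.

```python
import re

# Symbol shorthands from Table 2-1 mapped to their word operators.
SYMBOLS = {"+": "AND", "&": "OR", "+-": "AND NOT", "&-": "OR NOT"}
OPERATORS = set(SYMBOLS.values())

def tokenize(query):
    """Split a query into terms and operators, joining 'AND NOT' / 'OR NOT'."""
    tokens = []
    for word in query.split():
        word = SYMBOLS.get(word, word)
        if word == "NOT" and tokens and tokens[-1] in ("AND", "OR"):
            tokens[-1] += " NOT"
        else:
            tokens.append(word)
    return tokens

def matches(query, text):
    """Return True if the text satisfies the query (evaluated left to right)."""
    words = set(re.findall(r"\w+", text.lower()))
    tokens = tokenize(query)
    terms, ops = tokens[0::2], tokens[1::2]
    if len(tokens) % 2 == 0 or any(t in OPERATORS for t in terms):
        raise ValueError("query must alternate terms and operators")
    result = terms[0].lower() in words
    for op, term in zip(ops, terms[1:]):
        present = term.lower() in words
        if op == "AND":
            result = result and present
        elif op == "OR":
            result = result or present
        elif op == "AND NOT":
            result = result and not present
        elif op == "OR NOT":
            result = result or not present
        else:
            # e.g. a bare NOT, as in 'peanut NOT butter'
            raise ValueError(f"unknown operator: {op!r}")
    return result

doc = "Peanut butter sandwich"
print(matches("peanut AND butter", doc))      # True
print(matches("peanut AND NOT butter", doc))  # False
print(matches("jelly & butter", doc))         # True
```

As in the table, a bare NOT is rejected: matches("peanut NOT butter", doc) raises ValueError in this sketch.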

 

 

You can also navigate through the folder structure of the site. The dropdown box before the Submit button lists the subfolders of the current folder. Selecting a folder name restricts the search to that folder and its subfolders. '..' takes you one level up.
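
Restricting a search to a folder and its subfolders amounts to a path-prefix test on each document URL. The following Python sketch shows one way to express that, under assumptions: the function names in_folder and parent_folder and the example URLs are hypothetical, folders are taken relative to the site root, and this is not how phpMySearch itself necessarily stores or filters folders.

```python
import posixpath
from urllib.parse import urlsplit

def in_folder(url, folder):
    """True if the URL lies in the given folder or one of its subfolders."""
    folder = folder.strip("/")
    # The trailing slash keeps 'CPAN/objects' from matching 'CPAN/objects-old'.
    prefix = "/" + folder + "/" if folder else "/"
    return urlsplit(url).path.startswith(prefix)

def parent_folder(folder):
    """Folder one level up, as selected by '..'; the root is the empty string."""
    return posixpath.dirname(folder.strip("/"))

# Hypothetical document URLs used only for this example.
urls = [
    "http://www.servername.com/CPAN/objects/index.html",
    "http://www.servername.com/CPAN/objects/sub/a.html",
    "http://www.servername.com/CPAN/other.html",
]

print([u for u in urls if in_folder(u, "CPAN/objects")])  # the first two URLs
print(parent_folder("CPAN/objects"))                      # CPAN
```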

 

 

Important information

Main

Please note that if phpMySearch visits websites and stores their content in your database, you need the website owner's permission to do so.

Costs

If the phpMySearch spider visits websites that are not hosted locally on the same web server, a large amount of data is transferred between the two servers. This can result in high costs.

Updates

You can subscribe to the newsletter service at http://phpMySearch.web4.hm, and we will inform you automatically about updates.

You can also find the latest updates on this homepage at any time.

 

 

 

 
Terms Of Usage

 

Copyright (c) 2001,2002 phpMySearch-TEAM
All rights reserved.
Internet: http://phpMySearch.web4.hm, Email: phpMySearch@web4.hm
 
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

 

  1. You may use this software and documentation for free, so long as this license and the copyright notice are preserved in the software and documentation and at the end of the output.
  2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
  3. You are allowed to modify the HTML code.
  4. You are allowed to modify the source code, provided that you mail the changed source code to phpMySearch@web4.hm
  5. Free redistributions of any form whatsoever must retain the following acknowledgment:

"Original product freely available from http://phpMySearch.web4.hm" or

"Original-Produkt frei erhaeltlich bei http://phpMySearch.web4.hm".

  6. The phpMySearch-TEAM may publish revised and/or new versions of the license from time to time.
  7. Subject to changes and innovations; errors and misprints excepted.
  8. This product is not free for reselling - reselling free code parts from http://phpMySearch.web4.hm requires a special product registration.
  9. The name must not be used to endorse or promote products derived from this software without prior written permission from the phpMySearch.web4.hm-TEAM. This does not apply to add-on libraries or tools that work in conjunction with the software. In such a case the name may be used to indicate that the product supports it.
  10. phpMySearch.web4.hm is allowed to forbid the usage of this program or of parts of its code.
  11. THE COVERED CODE IS PROVIDED "AS IS" AND WITHOUT WARRANTY, UPGRADES OR SUPPORT OF ANY KIND. NO ORAL OR WRITTEN INFORMATION OR ADVICE GIVEN BY PHPMYSEARCH, A PHPMYSEARCH AUTHORIZED REPRESENTATIVE OR ANY CONTRIBUTOR SHALL CREATE A WARRANTY. THIS PROGRAM IS DISTRIBUTED IN THE HOPE THAT IT WILL BE USEFUL, BUT WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.