Chapter Notes: Searching
6.1 Introduction
Searching is the process of locating a specific element, termed the key, within a collection of elements. The chapter outlines the importance of searching in computer science as it contributes to data retrieval efficiency. There are various search techniques that programmers should understand to design effective algorithms for data processing.
6.2 Linear Search
Linear search, also known as sequential search, is the simplest search method. In this method, each element in the list is checked in sequence until the desired element is found or all elements have been checked. Linear search can be implemented as follows:
Algorithm: Linear Search
- Set an index to 0.
- While the index is less than the number of elements (n):
- Compare the element at the current index with the key.
- If a match is found, return the position.
- If no match is found, increment the index.
- If the key is not found after checking all elements, return that the search was unsuccessful.
Example
To illustrate linear search, consider a list:
L = [2, 3, 9, 7, 6, 11, 12, 17, 45, 23, 29, 31, -37, 41, 43]
- If searching for key
17, the algorithm checks elements sequentially until it finds 17 at position 4 (index 3). The total comparisons vary; if the element is not found, it results in n comparisons at worst.
Program Implementation
```python
def linearSearch(list, key):
for index in range(0, len(list)):
if list[index] == key:
return index + 1 # Position in list
return None # Key not in list
```
Disadvantages of Linear Search:
- Slow for large datasets since every element must potentially be checked.
- Performs well on small or unsorted lists.
6.3 Binary Search
Binary search is a more efficient method, but it requires that the data be sorted. This method operates by dividing the search space in half:
- Determine the middle element of the array.
- Compare the middle element to the key.
- If the middle element is equal to the key, the search is successful.
- If the key is less than the middle element, repeat the search in the left half; if greater, search in the right half.
- Continue this halving process until the key is found or the search space is empty.
Example: Searching Key 17 in List
Given a sorted list, the algorithm may find 17 with minimal iterations:
- First check middle element.
- If not found, halve the search space appropriately until reaching the key or confirming absence.
Algorithm: Binary Search
- Set first to 0 and last to n-1.
- Calculate mid index as (first + last) // 2.
- Check if the middle element is equal to the key:
- If yes, return position.
- If key < middle, adjust last to mid - 1.
- If key > middle, adjust first to mid + 1.
- If element not found after all checks, return unsuccessful.
Program Implementation
```python
def binarySearch(list, key):
first = 0
last = len(list) - 1
while first <= last:
mid = (first + last) // 2
if list[mid] == key:
return mid
elif key > list[mid]:
first = mid + 1
else:
last = mid - 1
return -1 # Not found
```
Performance
- Binary search significantly reduces search time for large sorted datasets with its logarithmic time complexity, often taking log2(n).
6.3.1 Applications of Binary Search
Binary search is applied in various scenarios, such as:
- Searching in dictionaries and phone books.
- Finding min/max values in sorted datasets.
- Structuring databases more effectively.
6.4 Searching by Hashing
Hashing is a method for very efficient searching, allowing a search in constant time on average:
- Elements are processed through a hash function which computes an index based on the element’s value.
- The computed index tells where to check for the element in a hash table, making searching instantaneous if no collisions occur.
When hashing, each element's value can be linked to an index by:
h(element) = element % size of hash table
Program Example of Hashing
```python
Function to check if a key is present
def hashFind(key, hashTable):
if (hashTable[key % 10] == key):
return (key % 10) + 1
else:
return None # Key not present
```
Collision Resolution
A collision happens when two keys hash to the same index. Techniques for resolving collisions includes chaining and open addressing. A perfect hash function minimizes the chance of collisions.
6.5 Summary of Key Concepts
- Searching allows locating a specific key in a collection of data.
- Linear Search: simple but inefficient for large lists; checks each element sequentially.
- Binary Search: efficient for sorted lists; divides search space in half.
- Hashing: enables constant-time searching; utilizes hash functions but requires collision management.
Key Points
- Searching: Process of locating a specific element in data.
- Linear Search: Compares elements sequentially; slow for large datasets.
- Binary Search: Efficiently finds elements in sorted data by halving search space.
- Hashing: Allows quick lookups using hash functions; requires handling of collisions.
- Collision: Occurs when two elements map to the same index in a hash table.
- Applications: Broad uses in various algorithms, databases, and data structure optimization.