Searching

The chapter discusses various search algorithms including linear search for unsorted data, binary search for sorted data, and hashing for efficient key lookups, emphasizing their applications and performance differences.

Chapter Notes: Searching

6.1 Introduction

Searching is the process of locating a specific element, termed the key, within a collection of elements. The chapter outlines the importance of searching in computer science as it contributes to data retrieval efficiency. There are various search techniques that programmers should understand to design effective algorithms for data processing.

6.2 Linear Search

Linear search, also known as sequential search, is the simplest search method. In this method, each element in the list is checked in sequence until the desired element is found or all elements have been checked. Linear search can be implemented as follows:

Algorithm: Linear Search

  1. Set an index to 0.
  2. While the index is less than the number of elements (n):
    • Compare the element at the current index with the key.
    • If a match is found, return the position.
    • If no match is found, increment the index.
  3. If the key is not found after checking all elements, return that the search was unsuccessful.

Example

To illustrate linear search, consider a list:

  • L = [2, 3, 9, 7, 6, 11, 12, 17, 45, 23, 29, 31, -37, 41, 43]
  • If searching for key 17, the algorithm checks elements sequentially until it finds 17 at position 4 (index 3). The total comparisons vary; if the element is not found, it results in n comparisons at worst.

Program Implementation

```python def linearSearch(list, key): for index in range(0, len(list)): if list[index] == key: return index + 1 # Position in list return None # Key not in list ```

Disadvantages of Linear Search:

  • Slow for large datasets since every element must potentially be checked.
  • Performs well on small or unsorted lists.

6.3 Binary Search

Binary search is a more efficient method, but it requires that the data be sorted. This method operates by dividing the search space in half:

  1. Determine the middle element of the array.
  2. Compare the middle element to the key.
  3. If the middle element is equal to the key, the search is successful.
  4. If the key is less than the middle element, repeat the search in the left half; if greater, search in the right half.
  5. Continue this halving process until the key is found or the search space is empty.

Example: Searching Key 17 in List

Given a sorted list, the algorithm may find 17 with minimal iterations:

  • First check middle element.
  • If not found, halve the search space appropriately until reaching the key or confirming absence.

Algorithm: Binary Search

  1. Set first to 0 and last to n-1.
  2. Calculate mid index as (first + last) // 2.
  3. Check if the middle element is equal to the key:
    • If yes, return position.
    • If key < middle, adjust last to mid - 1.
    • If key > middle, adjust first to mid + 1.
  4. If element not found after all checks, return unsuccessful.

Program Implementation

```python def binarySearch(list, key): first = 0 last = len(list) - 1 while first <= last: mid = (first + last) // 2 if list[mid] == key: return mid elif key > list[mid]: first = mid + 1 else: last = mid - 1 return -1 # Not found ```

Performance

  • Binary search significantly reduces search time for large sorted datasets with its logarithmic time complexity, often taking log2(n).

6.3.1 Applications of Binary Search

Binary search is applied in various scenarios, such as:

  • Searching in dictionaries and phone books.
  • Finding min/max values in sorted datasets.
  • Structuring databases more effectively.

6.4 Searching by Hashing

Hashing is a method for very efficient searching, allowing a search in constant time on average:

  1. Elements are processed through a hash function which computes an index based on the element’s value.
  2. The computed index tells where to check for the element in a hash table, making searching instantaneous if no collisions occur.

When hashing, each element's value can be linked to an index by:

  • h(element) = element % size of hash table

Program Example of Hashing

```python

Function to check if a key is present

def hashFind(key, hashTable): if (hashTable[key % 10] == key): return (key % 10) + 1 else: return None # Key not present ```

Collision Resolution

A collision happens when two keys hash to the same index. Techniques for resolving collisions includes chaining and open addressing. A perfect hash function minimizes the chance of collisions.

6.5 Summary of Key Concepts

  • Searching allows locating a specific key in a collection of data.
  • Linear Search: simple but inefficient for large lists; checks each element sequentially.
  • Binary Search: efficient for sorted lists; divides search space in half.
  • Hashing: enables constant-time searching; utilizes hash functions but requires collision management.

Key Points

  1. Searching: Process of locating a specific element in data.
  2. Linear Search: Compares elements sequentially; slow for large datasets.
  3. Binary Search: Efficiently finds elements in sorted data by halving search space.
  4. Hashing: Allows quick lookups using hash functions; requires handling of collisions.
  5. Collision: Occurs when two elements map to the same index in a hash table.
  6. Applications: Broad uses in various algorithms, databases, and data structure optimization.

Key terms/Concepts

  1. Searching locates elements in data collections.
  2. Linear Search is simple but slow for large datasets.
  3. Binary Search is efficient for sorted data, halving the search space.
  4. Hashing offers rapid key lookups via hash functions.
  5. Collision Resolution is needed when multiple keys hash to the same index.
  6. Sorted Data is essential for binary search to function optimally.
  7. Applications include dictionaries, databases, and indexed structures.

Other Recommended Chapters