Problem Set 5: Speller

Evidence of a 5-point Submission

check function calculates the hash index once and uses a loop with short-circuiting to inspect that index, uses strcasecmp to compare words
Implements a hash function that is more complex than sorting by the first n letters of a word or summing the ASCII values of a word, calculates an appropriate value of N
load function error handles after malloc and fopen calls, uses fscanf to store the new word directly in the table (without using a buffer)
size function returns size in O(1) (handled within the load function)
unload function uses recursion or a loop to free the entire table, only declares one or two new variables or functions

Evidence of a 4-point Submission

check function calculates the hash index once and uses a loop with short-circuiting to inspect that index, uses strcasecmp to compare words
Student attempts to implement a hash function that is more complex than sorting by the first n letters of a word
load function function error handles after malloc and fopen calls, may use a buffer to transfer a word into the table
size function returns size in O(1) (handled within the load function)
unload function uses recursion or a loop to free the entire table, may declare three or four new variables or functions

Evidence of a 3-point Submission

check function calculates the hash index once and uses a loop with short-circuiting to inspect that index
Student attempts to implement a hash function that is more complex than sorting by first letter
load function uses a buffer to transfer a word into the table
size function returns size in O(1) (handled within the load function)
unload function iterates through each linked list in each element of the table at most once

Evidence of a 2-point Submission

Defines N to be a number much larger than required by the given hash function
check function calculates the hash index more than once, does not short-circuit, uses strcmp to compare strings
hash function is less complex (sorts into fewer indices) than sorting by first letter
load function does not check that fopen or malloc have executed successfully, may not fscanf directly into the node, may unnecessarily initialize the table to NULL, may use other extraneous variables and conditionals within the loop
size function returns size in O(n)
unload function iterates through the table using extraneous variables or computation for each element

Evidence of a 1-point Submission

Defines N to be a number much smaller than required by the given hash function
check function does not use strcmp or strcasecmp to compare words, may malloc or do other unnecessary computation for each word in the list
hash function remains unchanged or may produce an index that is outside of the range of [0, N-1].
load function iterates through the dictionary in a way that is confusing or needlessly complicated
size function returns size in O(n)
unload function uses a confusing method to iterate through the table, may visit elements in the table more than once or skip over elements of the table

Example Implementations (Worse vs. Better)

`check` Function

Worse Implementation

The example below uses an unnecessary conditional statement

unsigned int hcode = hash(word);
node *trav = table[hcode];

while (trav != NULL)
{
    if (strcasecmp(trav->word, word) == 0)
    {
        return true;
    }
    else
    {
        trav = trav->next;
    }
}

return false;

Better Implementation

The example below uses no extraneous variables or conditionals

node *trav = table[hash(word)];

while (trav != NULL)
{
    if (strcasecmp(trav->word, word) == 0)
    {
        return true;
    }

    trav = trav->next;
}

return false;

Hash Function

Worse Implementation

The example below uses a value of N that will leave empty slots at both the beginning and end of the hash table. It also uses an unnecessary conditional statement

const unsigned int N = 1000000;

unsigned int hash(const char *word)
{
    int sum = 0;

    for (int i = 0; i < 3; i++)
    {
        if (strlen(word) > i)
        {
            sum += toupper(word[i]);
        }
        else
        {
            sum += 0;
        }
    }

    return (sum % N);
}

Better Implementation

Although the hash function itself is not better, the example below picks a more appropriate value for N and uses no unnecessary conditionals or variables. Note that it still wouldn’t be most efficient to call strlen twice.

const unsigned int N = 72;

unsigned int hash(const char *word)
{
    int sum = 0;
    int iterations = strlen(word) < 3 ? strlen(word) : 3;

    for (int i = 0; i < iterations; i++)
    {
        sum += toupper(word[i]) - 'A';
    }

    return sum % N;
}

`load` Function

Worse Implementation

The example below uses a buffer and an unnecessary conditional statement (there is no need to check if table[bucket] != NULL)

char new_word[LENGTH + 1];

while (fscanf(dict, "%s", new_word) != EOF)
{
    n = malloc(sizeof(node));
    if (n == NULL)
    {
        return false;
    }

    strcpy(n->word, new_word);
    n->next = NULL;

    int bucket = hash(n->word);
    if (table[bucket] != NULL)
    {
        n->next = table[bucket];
    }

    table[bucket] = n;
    size_of_dict += 1;
}

// clean up...

Better Implementation

The example below avoids using a buffer by freading directly into the node

// load dictionary...

while (true)
{
    node *newnode = malloc(sizeof(node));
    if (newnode == NULL)
    {
        return false;
    }

    if (fscanf(source, "%s", newnode->word) == EOF)
    {
        free(newnode);
        break;
    }

    unsigned int hcode = hash(newnode->word);

    newnode->next = table[hcode];
    table[hcode] = newnode;

    dsize++;
}

// clean up...

`size` Function

Worse Implementation

The example below includes an unnecessary conditional

if (counter > 0)
{
    return counter;
}

return 0;

Better Implementation

The example below includes no unnecessary conditionals

// calculated in load
return dsize;

`unload` Function

Worse Implementation

The example below includes an unnecessary conditional and initializes many variables

for (int x = 0; x < N; x++)
{
    node *checker = table[x];
    while (checker != NULL)
    {
        node *tmp = checker;
        checker = checker->next;
        free(tmp);
    }

    if (x == N - 1 && checker == NULL)
    {
        return true;
    }
}
return false;

Better Implementation

The example below has no unnecessary conditionals and uses recursion to avoid the need for extraneous variables

void vacate_list(node *head)
{
    if (head == NULL)
    {
        return;
    }

    vacate_list(head->next);
    free(head);
}

bool unload(void)
{
    for (int i = 0; i < N; i++)
    {
        vacate_list(table[i]);
    }

    return true;
}

Problem Set 5: Speller

Evidence of a 5-point Submission

Evidence of a 4-point Submission

Evidence of a 3-point Submission

Evidence of a 2-point Submission

Evidence of a 1-point Submission

Example Implementations (Worse vs. Better)

check Function

Worse Implementation

Better Implementation

Hash Function

Worse Implementation

Better Implementation

load Function

Worse Implementation

Better Implementation

size Function

Worse Implementation

Better Implementation

unload Function

Worse Implementation

Better Implementation

`check` Function

`load` Function

`size` Function

`unload` Function