Problem Set 5: Speller
Evidence of a 5-point Submission
check
function calculates the hash index once and uses a loop with short-circuiting to inspect that index, usesstrcasecmp
to compare words- Implements a hash function that is more complex than sorting by the first
n
letters of a word or summing the ASCII values of a word, calculates an appropriate value ofN
load
function error handles aftermalloc
andfopen
calls, usesfscanf
to store the new word directly in the table (without using a buffer)size
function returns size in O(1) (handled within theload
function)unload
function uses recursion or a loop to free the entire table, only declares one or two new variables or functions
Evidence of a 4-point Submission
check
function calculates the hash index once and uses a loop with short-circuiting to inspect that index, usesstrcasecmp
to compare words- Student attempts to implement a hash function that is more complex than sorting by the first
n
letters of a word load
function function error handles aftermalloc
andfopen
calls, may use a buffer to transfer a word into the tablesize
function returns size in O(1) (handled within theload
function)unload
function uses recursion or a loop to free the entire table, may declare three or four new variables or functions
Evidence of a 3-point Submission
check
function calculates the hash index once and uses a loop with short-circuiting to inspect that index- Student attempts to implement a hash function that is more complex than sorting by first letter
load
function uses a buffer to transfer a word into the tablesize
function returns size in O(1) (handled within the load function)unload
function iterates through each linked list in each element of the table at most once
Evidence of a 2-point Submission
- Defines
N
to be a number much larger than required by the given hash function check
function calculates the hash index more than once, does not short-circuit, usesstrcmp
to compare strings- hash function is less complex (sorts into fewer indices) than sorting by first letter
load
function does not check thatfopen
ormalloc
have executed successfully, may notfscanf
directly into the node, may unnecessarily initialize the table toNULL
, may use other extraneous variables and conditionals within the loopsize
function returns size in O(n)unload
function iterates through the table using extraneous variables or computation for each element
Evidence of a 1-point Submission
- Defines
N
to be a number much smaller than required by the given hash function check
function does not usestrcmp
orstrcasecmp
to compare words, maymalloc
or do other unnecessary computation for each word in the list- hash function remains unchanged or may produce an index that is outside of the range of [0, N-1].
load
function iterates through the dictionary in a way that is confusing or needlessly complicatedsize
function returns size in O(n)unload
function uses a confusing method to iterate through the table, may visit elements in the table more than once or skip over elements of the table
Example Implementations (Worse vs. Better)
check
Function
Worse Implementation
The example below uses an unnecessary conditional statement
unsigned int hcode = hash(word);
node *trav = table[hcode];
while (trav != NULL)
{
if (strcasecmp(trav->word, word) == 0)
{
return true;
}
else
{
trav = trav->next;
}
}
return false;
Better Implementation
The example below uses no extraneous variables or conditionals
node *trav = table[hash(word)];
while (trav != NULL)
{
if (strcasecmp(trav->word, word) == 0)
{
return true;
}
trav = trav->next;
}
return false;
Hash Function
Worse Implementation
The example below uses a value of N
that will leave empty slots at both the beginning and end of the hash table. It also uses an unnecessary conditional statement
const unsigned int N = 1000000;
unsigned int hash(const char *word)
{
int sum = 0;
for (int i = 0; i < 3; i++)
{
if (strlen(word) > i)
{
sum += toupper(word[i]);
}
else
{
sum += 0;
}
}
return (sum % N);
}
Better Implementation
Although the hash function itself is not better, the example below picks a more appropriate value for N
and uses no unnecessary conditionals or variables. Note that it still wouldn’t be most efficient to call strlen
twice.
const unsigned int N = 72;
unsigned int hash(const char *word)
{
int sum = 0;
int iterations = strlen(word) < 3 ? strlen(word) : 3;
for (int i = 0; i < iterations; i++)
{
sum += toupper(word[i]) - 'A';
}
return sum % N;
}
load
Function
Worse Implementation
The example below uses a buffer and an unnecessary conditional statement (there is no need to check if table[bucket] != NULL)
char new_word[LENGTH + 1];
while (fscanf(dict, "%s", new_word) != EOF)
{
n = malloc(sizeof(node));
if (n == NULL)
{
return false;
}
strcpy(n->word, new_word);
n->next = NULL;
int bucket = hash(n->word);
if (table[bucket] != NULL)
{
n->next = table[bucket];
}
table[bucket] = n;
size_of_dict += 1;
}
// clean up...
Better Implementation
The example below avoids using a buffer by freading directly into the node
// load dictionary...
while (true)
{
node *newnode = malloc(sizeof(node));
if (newnode == NULL)
{
return false;
}
if (fscanf(source, "%s", newnode->word) == EOF)
{
free(newnode);
break;
}
unsigned int hcode = hash(newnode->word);
newnode->next = table[hcode];
table[hcode] = newnode;
dsize++;
}
// clean up...
size
Function
Worse Implementation
The example below includes an unnecessary conditional
if (counter > 0)
{
return counter;
}
return 0;
Better Implementation
The example below includes no unnecessary conditionals
// calculated in load
return dsize;
unload
Function
Worse Implementation
The example below includes an unnecessary conditional and initializes many variables
for (int x = 0; x < N; x++)
{
node *checker = table[x];
while (checker != NULL)
{
node *tmp = checker;
checker = checker->next;
free(tmp);
}
if (x == N - 1 && checker == NULL)
{
return true;
}
}
return false;
Better Implementation
The example below has no unnecessary conditionals and uses recursion to avoid the need for extraneous variables
void vacate_list(node *head)
{
if (head == NULL)
{
return;
}
vacate_list(head->next);
free(head);
}
bool unload(void)
{
for (int i = 0; i < N; i++)
{
vacate_list(table[i]);
}
return true;
}