2.3.3.2.1 Finding the Longest Match to Input
The purpose here is to scan over the dictionary to locate the longest string. It is important that, as the code finds a new longest match, the newly matched character SHOULD be added to the dictionary at that time (refer to the AddByteToDictionary procedure calls in the pseudo-code as follows).
In the case where the length of the match is zero, the literal that is being searched for MUST be added to the dictionary.
The scan MUST begin at the dictionary write offset plus 1 when the dictionary end offset is equal to 4096 bytes. When the end offset is less than 4096 bytes, the scan MUST begin at index zero. The scan SHOULD stop when 17 characters are matched but MUST stop after the finalOffset position is scanned, where finalOffset is defined as the dictionary write offset modulo 4096.
Matches that start at or before finalOffset and match across finalOffset allow a repeating sequence of characters, such as "XYZXYZXYZXYZ", to be represented as a series of appropriate initial literals ('X' 'Y' 'Z') and a single dictionary reference. (This example generates an offset of 210 and a length of 9, assuming that the dictionary is initialized as specified in section 2.1.2.1.) For a more detailed example, see section 3.2.2.
The algorithm computes the longest match in the dictionary of the current position within the input by using one of multiple implementation-dependent mechanisms. The following pseudocode is provided as one example; however, it is not necessary to follow this exactly, so long as the decompression algorithm specified in section 2.2 generates the original input given the compressed output generated.
-
PROCEDURE FindLongestMatch SET finalOffset to the Write Offset of the Dictionary modulo 4096 IF the Dictionary's End Offset is not equal to the Dictionary buffer size THEN SET matchOffset to 0 ELSE SET matchOffset to (the Dictionary's Write Offset + 1) modulo 4096 ENDIF SET bestMatchLength to 0 REPEAT CALL TryMatch with matchOffset and the Input Cursor SET matchOffset to (matchOffset + 1) modulo 4096 UNTIL matchOffset equals finalOffset OR until bestMatchLength is 17 bytes long IF bestMatchLength is 0 THEN CALL AddByteToDictionary with the byte at Input Cursor ENDIF RETURN offset of bestMatchOffset and bestMatchLength ENDPROCEDURE PROCEDURE TryMatch SET maxLength to the minimum of 17 and remaining bytes of Input SET matchLength to 0 SET inputOffset to the Input Cursor SET dictionaryOffset to matchOffset WHILE matchLength is less than maxLength AND the byte in the Dictionary at dictionaryOffset is equal to the byte in Input at the inputOffset INCREMENT matchLength IF matchLength is greater than bestMatchLength THEN CALL AddByteToDictionary with the byte in Input at the inputOffset ENDIF INCREMENT inputOffset SET dictionaryOffset to (dictionaryOffset + 1) modulo 4096 ENDWHILE IF matchLength is greater than bestMatchLength THEN SET bestMatchOffset to matchOffset SET bestMatchLength to matchLength ENDIF ENDPROCEDURE PROCEDURE AddByteToDictionary SET the byte at the Dictionary's current Write Offset to the provided byte IF the Dictionary's End Offset is less than the buffer size THEN INCREMENT the End Offset ENDIF SET the Dictionary's Write Offset to (the Dictionary's Write Offset + 1) modulo 4096 ENDPROCEDURE