Problem Set 5, CS 257

Due before class Monday Oct 20, 1997
Write quickly and you will never write well; write well, and you will soon write quickly.

Clearness is the first essential, then brevity, and vigor.

Correct repeatedly and stoically.

Erasure is as important as writing.

Prune what is turgid, elevate what is commonplace, arrange what is disorderly, introduce rhythm where the language is harsh, modify where it is too absolute.

The best method of correction is to put aside for a time what we have written, so that when we come to it again it may have an aspect of novelty, as of being another man's work; in this way we may preserve ourselves from regarding our writings with the affection that we lavish upon a newborn child.

Marcus Fabius Quintilianus
Roman Poet, Circa 65 AD
  1. Do the following problems from the book:

    15.2
    15.3
    17.12
    19.9


    Here is some junk mail:
    Note the bar code in the lower right hand corner.

    You have been hired by the post office to write software that reads the bar codes bulk mailers are required to produce.

    These bar codes consist of a series of long and short bars. We will call the long bars 1's and the short bars 0's.

    Let's zoom in on the bar code strip itself:

    These strips always begin and end with a 1. In between are five-bar code groups, each of which represents one decimal digit. The correspondence is as follows:
    digit code group
    1 0 0 0 1 1
    2 0 0 1 0 1
    3 0 0 1 1 0
    4 0 1 0 0 1
    5 0 1 0 1 0
    6 0 1 1 0 0
    7 1 0 0 0 1
    8 1 0 0 1 0
    9 1 0 1 0 0
    0 1 1 0 0 0
    So for this bar code:
    we get the bits 11001010001000110011000011110001100011000000111010010100000111. Breaking this up into groups gives 1 10010 10001 00011 00110 00011 11000 11000 11000 00011 10100 10100 00011 1. Looking these up gives the digits 8 7 1 3 1 0 0 0 1 9 9 1.

    Good: the first five digits are the zip code in the address block.

    Note that there are exactly two 1s and three 0s in each five-bit group. (The decimal digit values can be computed using the column weights 7, 4, 2, 1, 0. For example, 01100 is (+ (* 0 7) (* 1 4) (* 1 2) (* 0 1) (* 0 0)) = 6. The exception to this rule is the digit 0, which would come to 11 by the above formula. If you think these bit patterns are neat, you should take a course on information theory.)

    Your task is to write a Scheme procedure that takes the sentence of 0's and 1's output by the physical scanning device and returns a sentence of decimal digits.

    There are many sources of error that cause incorrect or invalid readings: the labels can be torn or smudged, dust and dirt and vibration can cause the scanning device to misregister, etc. For this reason, there are a number of features built into the code to allow such errors to be detected, so the envelope can be diverted for manual processing. For example, the bit patterns above are such that a misreading of 0s as 1s, or alternatively of 1s as 0s, results in an invalid pattern.

    Another such feature is that the last digit in the strip is a ``check digit.'' It is chosen so that the sum of all the digits in the strip comes to a multiple of 10. So if one wanted to encode the number 87131000199 in a bar code strip like this, one would first add the digits, 8+7+1+3+1+0+0+0+1+9+9=39, so the check digit would be a 1, because 39+1=40 which is a multiple of 10.

    To give an example of all the above, in order to encode the zip code 87131, we first append a check digit, giving 871310 (8+7+1+3+1+0 = 20). Then these get converted into into five-bar codes, giving 10010 10001 00011 00110 00011 11000. Then a 1 is appended to each end, to get the final bar pattern: 11001010001000110011000011110001.

    Your job is to write the procedure that will convert a sentence of bars read in, eg (1 1 0 0 1 0 1 0 0 0 1 0 0 0 1 1 0 0 1 1 0 0 0 0 1 1 1 1 0 0 0 1), into the corresponding sentence of decimal digits, eg (8 7 1 3 1).

    Because of the presence of errors in the strip read in, you must be very careful to call the procedure error if there is any evidence of an incorrectly read strip! This means you must call error if the 1's at the end are not there, or if any five-bar group is invalid, or if the check digit is invalid, or if the total number of bars read doesn't come to 5n+2 for some n>0. (You cannot assume that n=5+1=6, for five-digit zip codes, because (a) sometimes 9-digit zip codes are used, as above, (b) sometimes extra information is encoded beyond this, as above, and (c) you might want to sell your program to some other country that has some other zip code length.)

    The remainder of this problem set involves first writing subroutines that will be useful for this task, and finally writing the actual conversion routine.

  2. Define first-n, a function that takes as its arguments a nonnegative integer n and a sentence sen. It returns a sentence consisting of the first n words of sen.

    For example, (first-n 2 '(aye bee sea dee eye eff)) yields (aye bee).

  3. Define bf-n, a function that takes as its arguments a nonnegative integer n and a sentence sen. It returns a sentence consisting of all but the first n words of sen.

    For example, (bf-n 2 '(aye bee sea dee eye eff)) yields (sea dee eye eff).

  4. Define bars->digit, which takes a sentence of five binary (0/1) values and returns the corresponding digit (0 to 9). If the pattern it was passed is invalid it had better call the error procedure to report the problem! However, it doesn't have to make sure the sentence it was passed has length 5 or contains only 0's and 1's - that can be assumed.

    Eg (bars->digit '(1 0 0 1 0)) yields 8 while (bars->digit '(1 0 1 1 0)) calls error.

  5. Define barcode->zipcode, which takes a sentence of 0's and 1's representing the short vs tall bars of a strip, and returns a sentence of the decimal digits that it encodes.

    Eg (barcode->zipcode '(1 1 0 0 1 0 1 0 0 0 1 0 0 0 1 1 0 0 1 1 0 0 0 0 1 1 1 1 0 0 0 1)) yields (8 7 1 3 1).

    You can assume that the sentence it is passed contains only 0's and 1's. But you must call error if there is any kind of format error, or if the check digit is incorrect. (The check digit itself should not be part of the sentence returned.)

    You may find it useful to write more helper functions than just those assigned above.

  6. Extra credit: write a better version, barcode->zipcode2, which can still give the correct result if a bar has been read incorrectly. This can be accomplished by (a) recognizing which five-bar group has been corrupted, and (b) using the check digit to figure out what the corrupted digit should have been.

    More extra credit: write barcode->zipcode3 and barcode->zipcode4 which try to recover from a single missing bar or a single extra inserted bar, respectively, as might occur if the envelope were creased or torn. There might be more than one possible correction under these circumstances, so you should return a list of all potential correct sentences of decimal digits.


Barak Pearlmutter <bap@cs.unm.edu>