The complete nucleotide sequence was determined for the Streptococcus sobrinus MFe28 gtfI gene, which encodes a glucosyltransferase that synthesizes an insoluble glucan. A single open reading frame encodes a mature glucosyltransferase of 1,559 amino acids (Mr, 172,983) and a signal peptide of 38 amino acids. The C-terminal one-third of the protein contains six partially homologous repeating units of 35 amino acids each and two completely homologous repeating units of 48 amino acids each. The functional role of these repeating units remains to be determined; however, truncated forms of the glucosyltransferase that contained only the first two partially homologous repeating units retained both glucosyltransferase activity and the ability to bind glucan. Regions of homology with alpha-amylase and glycogen phosphorylase were identified in the glucosyltransferase and may represent functionally similar domains.