- Patent Number:
7,099,876
- Appl. No:
09/211803
- Application Filed:
December 15, 1998
- Abstract:
A multi-field text string data structure is employed to encapsulate identification, meaning, and pronunciation information for a text string. A first field contains the Unicode characters for the text string in a language in which the text string is entered, which may be Latin characters, characters which sound-map to Latin characters, or one or more ideographs. A second field contains either the same characters or an intermediate representation of the text string, such as syllabary characters for a phonetic spelling of the characters within the first field. A third field contains either the same characters as the first field or a Latin character phonetic spelling of the characters in the first field. The first field thus contains the text string in the language in which the text string was entered, while the second and third fields contain information about the meaning and pronunciation of the text string. When the characters in the first field are unrecognizable to a user, or when the characters in the first field have more than one meaning or more than one pronunciation, the contents of the second and third fields allow the user to recognize the text string and/or perceive the correct meaning and pronunciation of the text string.
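As an illustration of the three-field arrangement described in the abstract, a minimal Python sketch might look like the following. The class name, field names, helper methods, and the example values are assumptions chosen for illustration; none of them come from the patent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MultiFieldTextString:
    """Hypothetical three-field text string, following the abstract."""

    first: str   # text in the language in which it was entered
    second: str  # same text, or an intermediate form (e.g. syllabary spelling)
    third: str   # same text, or a Latin-character phonetic spelling

    @classmethod
    def from_plain(cls, text: str) -> "MultiFieldTextString":
        # Case where all three fields hold the same characters.
        return cls(text, text, text)

    def pronunciation_hint(self) -> Optional[str]:
        # Return the Latin phonetic spelling only when it adds information.
        return self.third if self.third != self.first else None


# Ideographs, with a hiragana spelling and a Latin phonetic spelling.
japan = MultiFieldTextString("日本", "にほん", "nihon")

# Latin text entered directly: every field carries the same characters.
city = MultiFieldTextString.from_plain("Austin")
```

A reader who cannot recognize the ideographs in `japan.first` could fall back on `japan.pronunciation_hint()`, while `city.pronunciation_hint()` returns `None` because the third field adds nothing to the first.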
- Inventors:
Hetherington, David James (Austin, TX, US); Kumhyr, David Bruce (Fuquay-Varina, NC, US)
- Assignees:
International Business Machines Corporation (Armonk, NY, US)
- Claim:
1. A text string data structure within a computer usable medium, comprising: a multi-field text string object encapsulating a plurality of discrete fields; a first field within the multi-field text string object containing a first character string representing a word; a second field within the multi-field text string object containing a second character string representing the word; and a third field within the multi-field text string object containing a third character string representing the word; wherein the first character string contains characters for a first human language; and the third character string contains the first character string prefixed by at least one character with a low sort value.
- Claim:
2. The text string data structure of claim 1, wherein the second character string is different from the first character string.
- Claim:
3. The text string data structure of claim 1, wherein the first character string contains characters from a first character set employed by a first human language and the second character string contains characters from a second character set employed by a second human language.
- Claim:
4. The text string data structure of claim 1, wherein the first character string contains characters for a first human language and the second character string contains characters for a second human language which sound-map to characters within the first character string.
- Claim:
5. The text string data structure of claim 1, wherein the first character string contains an ideograph and the second character string contains a phonetic spelling of the ideograph.
- Claim:
6. The text string data structure of claim 1, wherein the third character string is different from the second character string.
- Claim:
7. The text string data structure of claim 6, wherein the third character string is different from the first character string.
- Claim:
8. The text string data structure of claim 1, wherein: the first character string contains characters for a first human language; the second character string contains characters for a second human language which sound-map to characters within the first character string; and the third character string is identical to the first character string.
- Claim:
9. The text string data structure of claim 1, wherein: the first character string contains an ideograph; the second character string contains Latin characters for a phonetic spelling of the ideograph; and the third character string contains syllabary characters for a phonetic spelling of the ideograph.
- Claim:
10. A method of encapsulating information in a text string data structure, comprising: creating a multi-field text string object encapsulating a plurality of discrete fields; storing a first character string representing a word in a first field within the multi-field text string object; storing a second character string representing the word in a second field within the multi-field text string object; storing a third character string representing the word in a third field within the multi-field text string object; and storing the first character string prefixed by at least one character with a low sort value as the third character string.
- Claim:
11. The method of claim 10, wherein the step of storing a second character string representing the word in a second field within the multi-field text string object further comprises: if the first character string contains characters from a first character set employed by a first human language, storing characters from a second character set employed by a second human language in the second field, wherein the second character string is different from the first character string.
- Claim:
12. The method of claim 10, further comprising: storing characters from a first human language as the first character string; storing characters from a second human language which sound-map to characters within the first character string as the second character string; and storing characters identical to the first character string as the third character string.
- Claim:
13. The method of claim 10, further comprising: storing an ideograph as the first character string; storing a Latin character phonetic spelling of the ideograph as the second character string; and storing syllabary characters for a phonetic spelling of the ideograph as the third character string.
- Claim:
14. The method of claim 10, further comprising: storing identical characters as the first, second, and third character strings.
- Claim:
15. A system for encapsulating information in a text string data structure, comprising: means for creating a multi-field text string object encapsulating a plurality of discrete fields; means for storing a first character string representing a word in a first field within the multi-field text string object; means for storing a second character string representing the word in a second field within the multi-field text string object; means for storing a third character string representing the word in a third field within the multi-field text string object; and means for storing the first character string prefixed by at least one character with a low sort value as the third character string.
- Claim:
16. The system of claim 15, wherein the means for storing a second character string representing the word in a second field within the multi-field text string object further comprises: means, if the first character string contains characters from a first character set employed by a first human language, for storing characters from a second character set employed by a second human language in the second field, wherein the second character string is different from the first character string.
- Claim:
17. The system of claim 15, further comprising: means for storing characters from a first human language as the first character string; means for storing characters from a second human language which sound-map to characters within the first character string as the second character string; and means for storing characters identical to the first character string as the third character string.
- Claim:
18. The system of claim 15, further comprising: means for storing an ideograph as the first character string; means for storing a Latin character phonetic spelling of the ideograph as the second character string; and means for storing syllabary characters for a phonetic spelling of the ideograph as the third character string.
- Claim:
19. The system of claim 15, further comprising: means for storing identical characters as the first, second, and third character strings.
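The independent claims recite a third field that holds the first character string prefixed by at least one character with a low sort value. A rough Python sketch of how such a prefix could affect ordering is shown below. It assumes plain code-point comparison and uses "!" (U+0021) as the low-sorting character; the record does not say which character or collation is intended, and every name in the sketch is hypothetical.

```python
from typing import NamedTuple

# Assumption: "!" (U+0021) sorts before letters and digits in code-point order.
LOW_SORT_PREFIX = "!"


class MultiFieldTextString(NamedTuple):
    """Hypothetical three-field text string."""

    first: str
    second: str
    third: str


def with_low_sort_prefix(first: str, second: str,
                         prefix: str = LOW_SORT_PREFIX) -> MultiFieldTextString:
    # Third field = first character string prefixed by a low-sorting character.
    return MultiFieldTextString(first, second, prefix + first)


entries = [
    MultiFieldTextString("Austin", "Austin", "Austin"),
    with_low_sort_prefix("Zebra", "Zebra"),
    MultiFieldTextString("Armonk", "Armonk", "Armonk"),
]

# Sorting on the third field moves the prefixed entry ahead of the others.
ordered = sorted(entries, key=lambda entry: entry.third)
print([entry.first for entry in ordered])  # ['Zebra', 'Armonk', 'Austin']
```

Because only the third field carries the prefix, the first field can still be shown to the user unchanged while the third field drives the sort order.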
- Current U.S. Class:
707/100
- Patent References Cited:
4379288 April 1983 Leung et al.
4384329 May 1983 Rosenbaum et al.
4544276 October 1985 Horodeck
4611280 September 1986 Linderman
4615002 September 1986 Innes
4641264 February 1987 Nitta et al.
4706212 November 1987 Toma
4730270 March 1988 Okajima et al.
4737040 April 1988 Moon
4951202 August 1990 Yan
4954984 September 1990 Kaijima et al.
4962452 October 1990 Nogami et al.
5040218 August 1991 Vitale et al.
5056021 October 1991 Ausborn
5091878 February 1992 Nagasawa et al.
5109352 April 1992 O'Dell
5136503 August 1992 Takagi et al.
5146587 September 1992 Francisco
5164900 November 1992 Bernath
5175803 December 1992 Yeh
5214583 May 1993 Miike et al.
5243519 September 1993 Andrews et al.
5251130 October 1993 Andrews et al.
5268990 December 1993 Cohen et al.
5307267 April 1994 Yang
5339433 August 1994 Frid-Nielsen
5371844 December 1994 Andrew et al.
5377317 December 1994 Bates et al.
5384700 January 1995 Lim et al.
5390295 February 1995 Bates et al.
5416903 May 1995 Malcolm
5418718 May 1995 Lim et al.
5420976 May 1995 Schell et al.
5426583 June 1995 Uribe-Echebarria Diaz De Mendibil
5432948 July 1995 Davis et al.
5434776 July 1995 Jain
5434777 July 1995 Luciw
5440482 August 1995 Davis
5448474 September 1995 Zamora
5485373 January 1996 Davis et al.
5490061 February 1996 Tolin et al.
5523946 June 1996 Kaplan et al.
5546575 August 1996 Potter et al.
5550965 August 1996 Gabbe et al.
5583761 December 1996 Chou
5594642 January 1997 Collins et al.
5600779 February 1997 Palmer et al.
5613122 March 1997 Burnard et al.
5640581 June 1997 Saraki
5640587 June 1997 Davis et al.
5642490 June 1997 Morgan et al.
5644775 July 1997 Thompson et al.
5649223 July 1997 Freeman
5652884 July 1997 Palevich
5675818 October 1997 Kennedy
5677835 October 1997 Carbonell et al.
5678039 October 1997 Hinks et al.
5682158 October 1997 Edberg et al.
5721825 February 1998 Lawson et al.
5724593 March 1998 Hargrave, III et al.
5734887 March 1998 Kingberg et al.
5758295 May 1998 Ahlberg et al.
5758314 May 1998 McKenna
5778356 July 1998 Heiny
5784069 July 1998 Daniels et al.
5784071 July 1998 Tang et al.
5787452 July 1998 McKenna
5790116 August 1998 Malone et al.
5793381 August 1998 Edberg et al.
5799303 August 1998 Tsuchimura
5802539 September 1998 Daniels et al.
5812122 September 1998 Ng
5812964 September 1998 Finger
5815148 September 1998 Tanaka
5828992 October 1998 Kusmierczyk
5832478 November 1998 George
5844798 December 1998 Uramoto
5870084 February 1999 Kanungo et al.
5872973 February 1999 Mitchell et al.
5873111 February 1999 Edberg
5917484 June 1999 Mullaney
5966637 October 1999 Kanungo et al.
5974372 October 1999 Barnes et al.
5978754 November 1999 Kumano
5995101 November 1999 Clark et al.
6003049 December 1999 Chiang
6024571 February 2000 Renegar
6028600 February 2000 Rosin et al.
6047252 April 2000 Kumano et al.
6078935 June 2000 Nielsen
6092037 July 2000 Stone et al.
6144377 November 2000 Oppermann et al.
6167366 December 2000 Johnson
6205418 March 2001 Li et al.
6219632 April 2001 Schumacher et al.
6229622 May 2001 Takeda
6252589 June 2001 Rettig et al.
6321191 November 2001 Kurahashi
6332148 December 2001 Paine et al.
0640913 March 1995
57-199070 December 1982
5-224687 September 1993
7-261652 October 1995
7-271793 October 1995
9-62679 March 1997
9-237270 September 1997
WO 97/404 October 1997
- Other References:
Baldwin, H., Object-Oriented Development: Multicultural C++ Tools Get Internationalization, Thread Safety. Open Systems Today, vol.-, No. 132, pp. 56 (reprinted), Sep. 1993. cited by other
U.S. Appl. No. 09/211,810, filed Dec. 15, 1998, Related Co-Pending Application, David J. Hetherington, et al. cited by other
U.S. Appl. No. 09/211,809, filed Dec. 15, 1998, Related Co-Pending Application, David J. Hetherington, et al. cited by other
U.S. Appl. No. 09/211,808, filed Dec. 15, 1998, Related Co-Pending Application, David J. Hetherington, et al. cited by other
U.S. Appl. No. 09/211,799, filed Dec. 15, 1998, Related Co-Pending Application, David J. Hetherington, et al. cited by other
U.S. Appl. No. 09/211,813, filed Dec. 15, 1998, Related Co-Pending Application, David J. Hetherington, et al. cited by other
U.S. Appl. No. 09/211,801, filed Dec. 15, 1998, Related Co-Pending Application, David J. Hetherington, et al. cited by other
U.S. Appl. No. 09/211,812, filed Dec. 15, 1998, Related Co-Pending Application, David J. Hetherington, et al. cited by other
New Icons, Oct. 1996, IBM Technical Disclosure Bulletin, vol. 39, No. 10, pp. 25-27. cited by other
Intelligent Computer Keyboard for Entering Texts of Sinhalese and Other Similar Languages, Nov. 1992, IBM Technical Disclosure Bulletin, vol. 35, No. 6, pp. 24-27. cited by other
Enhanced Methods for Spelling Names in Speech Recognition Systems, Nov. 1995, IBM Technical Disclosure Bulletin, vol. 38, No. 11, pp. 45-46. cited by other
Method for Allowing Translation of Operator Input Comparison Strings in an Online Presentation Program, Jan. 1986, IBM Technical Disclosure Bulletin, vol. 28, No. 8, pp. 3682-3683. cited by other
Architecture for Speech Synthesis from Text Recognition Methods, Apr. 1994, IBM Technical Disclosure Bulletin, vol. 37, No. 04A, pp. 287-289. cited by other
Bridging Speech Recognition and Natural Language Processing Subsystems, Jan. 1996, IBM Technical Disclosure Bulletin, vol. 39, No. 01, pp. 229-231. cited by other
- Primary Examiner:
Coby, Frantz
- Attorney, Agent or Firm:
LaBaw, Jeffrey S.
Dillon & Yudell LLP
- Accession Number:
edspgr.07099876