SRRESWORDNODE

This structure provides information about a word node in a recognition/alternative graph generated by a speech-recognition engine.

Syntax

typedef struct { // srrwn
    DWORD             dwNextWordNode;
    DWORD             dwUpAlternateWordNode;
    DWORD             dwDownAlternateWordNode;
    DWORD             dwPreviousWordNode;
    DWORD             dwPhonemeNode;
    QWORD             qwStartTime;
    QWORD             qwEndTime;
    DWORD             dwWordScore;
    WORD              wVolume;
    WORD              wPitch;
    VOICEPARTOFSPEECH pos;
    DWORD             dwCFGParse;
    DWORD             dwCue;
} SRRESWORDNODE, *PSRRESWORDNODE;

Members

dwNextWordNode
Specifies the number of the node that follows this one in time. For example, if the current node contains “Mail,” the next node would contain “To.” If the next node has several alternatives, this member should follow the path that produces the highest path score at the final node. If there is nothing left in the path, this can be NULL.
dwUpAlternateWordNode
Specifies the number of an alternative node with a higher score than that of the word in this node. For example, if this node contains “Nail,” the alternative node might contain “Mail.” The alternative nodes should be ordered according to the highest word score. If there are no more alternatives, this can be NULL.
dwDownAlternateWordNode
Specifies the number of an alternative node with a lower score than that of the word in this node. For example, if this node contains “Mail,” the alternative node might contain “Nail.” The alternative nodes should be ordered according to the highest word score. If there are no more alternatives, this can be NULL.
dwPreviousWordNode
Specifies the number of the previous node. If there are several previous nodes, this member follows the path that produces the highest path score at the final node. If there is nothing before this node in the path, this can be NULL.
dwPhonemeNode
Specifies the node number of the first phoneme in the word. If an engine does not support phonemes or if such a mapping is inappropriate, this should be NULL.
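The link members described above can be followed like a linked structure. The following is a minimal sketch, not the SDK's own traversal code. It assumes a hypothetical in-memory table in which nodes[i] holds the node numbered i and node number 0 stands for NULL; WordLinks and the three function names are illustrative stand-ins, and a real application would obtain each node from the engine instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the four link members of SRRESWORDNODE. A real application
   would use the structure from the SDK header instead. */
typedef struct {
    uint32_t dwNextWordNode;
    uint32_t dwUpAlternateWordNode;
    uint32_t dwDownAlternateWordNode;
    uint32_t dwPreviousWordNode;
} WordLinks;

/* Assumption: nodes[i] holds the node numbered i, and nodes[0] is unused so
   that node number 0 can stand for NULL ("no node"). */

/* Copies the node numbers on the path that starts at `first` into `out`,
   following dwNextWordNode. Returns how many numbers were written. */
size_t best_path(const WordLinks *nodes, size_t count, uint32_t first,
                 uint32_t *out, size_t out_cap)
{
    size_t n = 0;
    uint32_t cur = first;
    assert(nodes != NULL && out != NULL);
    /* n < count also stops the walk if malformed links form a cycle. */
    while (cur != 0 && cur < count && n < out_cap && n < count) {
        out[n++] = cur;
        cur = nodes[cur].dwNextWordNode;
    }
    return n;
}

/* Follows dwUpAlternateWordNode until no higher-scoring alternative remains
   and returns the number of the node reached. */
uint32_t top_alternate(const WordLinks *nodes, size_t count, uint32_t node)
{
    size_t steps = 0;
    assert(nodes != NULL);
    while (node != 0 && node < count && steps++ < count) {
        uint32_t up = nodes[node].dwUpAlternateWordNode;
        if (up == 0 || up >= count)
            break;
        node = up;
    }
    return node;
}

/* Counts the alternatives reachable through dwDownAlternateWordNode. */
size_t count_lower_alternates(const WordLinks *nodes, size_t count,
                              uint32_t node)
{
    size_t n = 0;
    uint32_t cur;
    assert(nodes != NULL);
    if (node == 0 || node >= count)
        return 0;
    cur = nodes[node].dwDownAlternateWordNode;
    while (cur != 0 && cur < count && n < count) {
        n++;
        cur = nodes[cur].dwDownAlternateWordNode;
    }
    return n;
}
```

With the “Mail”/“Nail” example, calling top_alternate on the “Nail” node would return the “Mail” node, and count_lower_alternates on “Mail” would report one alternative.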
qwStartTime
Time stamp, in bytes, when the audio for this node started. If the value is indeterminate, this is zero.
qwEndTime
Time stamp, in bytes, when the audio for this node ended. If the value is indeterminate, this is zero.
dwWordScore
Specifies the score for this individual word. It is valid to compare this score only against the scores of this word’s alternatives. Engines should make this value a linear probability if possible. If an engine does not know the value, this is zero.
wVolume
Unsigned 16-bit integer that contains the linear volume level of the word, from 1 to 0xFFFF. If the engine does not know the volume, this should be zero. Applications can use this information to determine emphasis or sentence type — for example, question or command — or to do transplanted prosody.
wPitch
Unsigned 16-bit integer that contains the average pitch of the word in hertz. Engines that do not have information about this value set it to zero. Applications can use this information to determine emphasis or sentence type — for example, question or command — or to do transplanted prosody.
pos
Part of speech of the word, if known. Examples are noun, verb, adverb, adjective, and conjunction.
dwCFGParse
Specifies the unique identifier of the rule used to parse the word. This is the rule identifier from the SRCFGRULE structure that defines the rule in the grammar. This member is valid only for context-free grammars.

If an imported rule was used to parse the word, dwCFGParse contains the unique rule identifier from the SRCFGIMPRULE structure that defines the imported rule in the grammar. An application can use this information to determine the semantics of a word and save itself the work of parsing the tree. For example, suppose the context-free grammar contains the following rules:

1 = SEQ(Send mail to 2)

2 = ALT(Mike, Bob, Fred)

If the engine recognizes “Send mail to Bob,” the value of dwCFGParse would be 1 for “Send mail to” and 2 for “Bob.”
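The rule identifiers in this example can be used to pick out the recipient without re-parsing the phrase. The following sketch assumes the application has already collected, for each word on the recognized path, its dwCFGParse value together with the word text. SRRESWORDNODE itself carries no text, so the ParsedWord type and first_word_for_rule function are hypothetical helpers introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pairing of a word's rule identifier with its text. */
typedef struct {
    uint32_t dwCFGParse;  /* rule identifier copied from the word node */
    const char *text;     /* word text, stored by the application */
} ParsedWord;

/* Returns the text of the first word parsed by `rule`, or NULL if no word
   on the path was parsed by that rule. */
const char *first_word_for_rule(const ParsedWord *words, size_t n,
                                uint32_t rule)
{
    assert(words != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (words[i].dwCFGParse == rule)
            return words[i].text;
    }
    return NULL;
}
```

For “Send mail to Bob,” looking up rule 2 would return “Bob,” which is the only part of the phrase the application needs in order to address the message.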

dwCue
Specifies a cue that the engine has recognized but cannot translate into a word string. A node with a cue might not have a word associated with it. An engine is not required to return cues, so an application should not rely on them, although it should take advantage of any cues that are returned. This member is one of the following flags:
SRRESCUE_COMMA
A comma should go here.
SRRESCUE_DECLARATIVEBEGIN
A declarative sentence has begun. This cue must be followed by a closing SRRESCUE_DECLARATIVEEND.
SRRESCUE_DECLARATIVEEND
A declarative sentence has ended.
SRRESCUE_IMPERATIVEBEGIN
An imperative sentence has begun. This cue must be followed by a closing SRRESCUE_IMPERATIVEEND.
SRRESCUE_IMPERATIVEEND
An imperative sentence has ended.
SRRESCUE_INTERROGATIVEBEGIN
An interrogative sentence has begun. This cue must be followed by a closing SRRESCUE_INTERROGATIVEEND.
SRRESCUE_INTERROGATIVEEND
An interrogative sentence has ended.
SRRESCUE_NOISE
A noise has occurred.
SRRESCUE_PAUSE
The speaker has paused.
SRRESCUE_SENTENCEBEGIN
A sentence has begun. This cue must be followed by a closing SRRESCUE_SENTENCEEND. If the engine has identified the sentence as interrogative, declarative, or imperative, it should use the more specific SRRESCUE_INTERROGATIVEBEGIN, SRRESCUE_DECLARATIVEBEGIN, or SRRESCUE_IMPERATIVEBEGIN cues.
SRRESCUE_SENTENCEEND
A sentence has ended. If the engine has identified the sentence as interrogative, declarative, or imperative, it should use the more specific SRRESCUE_INTERROGATIVEEND, SRRESCUE_DECLARATIVEEND, or SRRESCUE_IMPERATIVEEND cues.
SRRESCUE_UM
The speaker has said a nonsense sound, such as “Um.”
SRRESCUE_WILDCARD
This is a placeholder for a wildcard word in a context-free grammar.
SRRESCUE_WORD
The node contains a word.
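One way an application might take advantage of cues is to punctuate dictated text. The sketch below is illustrative only. The Cue enumeration is a local stand-in for a few of the SRRESCUE_* flags, with placeholder values rather than the values defined in the SDK header, and CueNode, append_text, and render_with_cues are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for some SRRESCUE_* flags. The numeric values are placeholders;
   real code would use the constants from the SDK header. */
typedef enum {
    CUE_WORD, CUE_COMMA, CUE_PAUSE, CUE_NOISE, CUE_UM,
    CUE_DECLARATIVE_END, CUE_INTERROGATIVE_END, CUE_OTHER
} Cue;

/* Hypothetical pairing of a node's cue with its word text. */
typedef struct {
    Cue cue;
    const char *text;  /* only meaningful when cue is CUE_WORD */
} CueNode;

/* Appends s to the NUL-terminated string in buf. Returns 0 if it would
   not fit in cap bytes, 1 otherwise. */
int append_text(char *buf, size_t cap, const char *s)
{
    size_t used = strlen(buf);
    size_t add = strlen(s);
    if (used + add + 1 > cap)
        return 0;
    memcpy(buf + used, s, add + 1);
    return 1;
}

/* Builds display text from a sequence of nodes: words are separated by
   spaces, comma and sentence-end cues become punctuation, and all other
   cues (pause, noise, "um") are skipped. Returns 0 if buf is too small. */
int render_with_cues(const CueNode *nodes, size_t n, char *buf, size_t cap)
{
    assert(buf != NULL && cap > 0);
    buf[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        int ok = 1;
        switch (nodes[i].cue) {
        case CUE_WORD:
            if (nodes[i].text == NULL)
                break;
            if (buf[0] != '\0')
                ok = append_text(buf, cap, " ");
            ok = ok && append_text(buf, cap, nodes[i].text);
            break;
        case CUE_COMMA:
            ok = append_text(buf, cap, ",");
            break;
        case CUE_DECLARATIVE_END:
            ok = append_text(buf, cap, ".");
            break;
        case CUE_INTERROGATIVE_END:
            ok = append_text(buf, cap, "?");
            break;
        default:
            break;  /* cue carries no text in this sketch */
        }
        if (!ok)
            return 0;
    }
    return 1;
}
```

Because engines are not required to return cues, a renderer like this should still produce usable text when every node is a plain word.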