Thursday, 9 February 2012

Monitoring Global Atom Table part I

The aim of this article is to give a sound understanding about Atoms, how to monitor them and check whether we have or have not any process that is potentially leaking atoms. As Microsoft very well defines:
"An atom table is a system-defined table that stores strings and corresponding identifiers. An application places a string in an atom table and receives a 16-bit integer, called an atom, that can be used to access the string. A string that has been placed in an atom table is called an atom name.
The system provides a number of atom tables. Each atom table serves a different purpose. Applications can use local atom tables to store their own item-name associations.
The system uses atom tables that are not directly accessible to applications. However, the application uses these atoms when calling a variety of functions. For example, registered clipboard formats are stored in an internal atom table used by the system. An application adds atoms to this atom table using the RegisterClipboardFormat function. Also, registered classes are stored in an internal atom table used by the system. An application adds atoms to this atom table using the RegisterClass or RegisterClassEx function."
source: Microsoft.

Atoms are stored as two-byte integers (uint16) and there can be 0xFFFF-0xC000=0x4000 (16384) entries maximum. If 0xFFFF is reached ERROR 8 is returned ("System Error. Code: 8. Not enough storage is available to process this command")

Discovering Atom table:
To display atom table we need to use Debugging tools for Windows: Windbg and debug the kernel to dump al registers.

First of all:
Install all necessary tools and then run Windbg and configure all symbol parameters to correctly use kernel debug mode(for windows 7/Server 2008 debug mode needs to be enforced using bcdedit /debug on ). The most important module here is win32k.sys.

Set Symbol file path (Menu File -> Symbol file path):
"symsrv*symsrv.dll*C:\Symbols*http://msdl.microsoft.com/download/symbols"

As we need win32k.sys, we need to generate the pdb files with symbols information using symchk tool from debugging directory:

C:\Program Files\Debugging Tools for Windows (x86)>symchk C:\temp\bb\win32k.sys /v
[SYMCHK] Searching for symbols to C:\temp\bb\win32k.sys in path symsrv*symsrv.dll*C:\Symbols*http://msdl.microsoft.com/download/symbols
DBGHELP: Symbol Search Path: symsrv*symsrv.dll*C:\Symbols*http://msdl.microsoft.com/download/symbols
[SYMCHK] Using search path "symsrv*symsrv.dll*C:\Symbols*http://msdl.microsoft.com/download/symbols"
DBGHELP: No header for C:\temp\bb\win32k.sys.  Searching for image on disk
DBGHELP: C:\temp\bb\win32k.sys - OK
SYMSRV:  win32k.pdb from http://msdl.microsoft.com/download/symbols: 1170455 bytes - copied
DBGHELP: win32k - public symbols
         C:\Symbols\win32k.pdb\D8788E4736B34ED5B8516DBB9D45E9942\win32k.pdb
[SYMCHK] MODULE64 Info ----------------------
[SYMCHK] Struct size: 1680 bytes
[SYMCHK] Base: 0xBF800000
[SYMCHK] Image size: 1839616 bytes
[SYMCHK] Date: 0x43446a58
[SYMCHK] Checksum: 0x001ca511
[SYMCHK] NumSyms: 0
[SYMCHK] SymType: SymPDB
[SYMCHK] ModName: win32k
[SYMCHK] ImageName: C:\temp\bb\win32k.sys
[SYMCHK] LoadedImage: C:\temp\bb\win32k.sys
[SYMCHK] PDB: "C:\Symbols\win32k.pdb\D8788E4736B34ED5B8516DBB9D45E9942\win32k.pdb"
[SYMCHK] CV: RSDS
[SYMCHK] CV DWORD: 0x53445352
[SYMCHK] CV Data:  win32k.pdb
[SYMCHK] PDB Sig:  0
[SYMCHK] PDB7 Sig: {D8788E47-36B3-4ED5-B851-6DBB9D45E994}
[SYMCHK] Age: 2
[SYMCHK] PDB Matched:  TRUE
[SYMCHK] DBG Matched:  TRUE
[SYMCHK] Line nubmers: FALSE
[SYMCHK] Global syms:  FALSE
[SYMCHK] Type Info:    TRUE
[SYMCHK] ------------------------------------
SymbolCheckVersion  0x00000002
Result              0x00130001
DbgFilename
DbgTimeDateStamp    0x43446a58
DbgSizeOfImage      0x001c1200
DbgChecksum         0x001ca511
PdbFilename         C:\Symbols\win32k.pdb\D8788E4736B34ED5B8516DBB9D45E9942\win32k.pdb
PdbSignature        {D8788E47-36B3-4ED5-B851-6DBB9D45E994}
PdbDbiAge           0x00000002
[SYMCHK] [ 0x00000000 - 0x00130001 ] Checked "C:\temp\bb\win32k.sys"

SYMCHK: FAILED files = 0
SYMCHK: PASSED + IGNORED files = 1

Once done, you will see the pdb file generated in C:\Symbols folder and you would be able to load it from Windbg command line using the WinDbg commands http://windbg.info/doc/1-common-cmds.html.

kd> !sym noisy
noisy mode - symbol prompts on
kd> .reload win32k.sys
kd> lm
start    end        module name
804d7000 806cdc00   nt         (export symbols)       ntkrnlpa.exe
bf800000 bf9c1200   win32k     (deferred) 

Now we know that win32k.sys it's loaded in address bf800000 and the referenced memory can be displayed by using dq command (using a 64-bit pointer) and only the first dword (L1):

kd> dq win32k!UserAtomTableHandle L1
DBGHELP: win32k - public symbols  
         c:\symbols\win32k.pdb\D8788E4736B34ED5B8516DBB9D45E9942\win32k.pdb
bf9a7c18  81b4a270`e1c47230

kd> dq 81b4a270`e1c47230+10
e1c47240  e1a841c0`e1ba3008 e15fb8f8`e19fae00
e1c47250  e1ba3080`e1700738 e1546428`e1708f88
e1c47260  e15f4e28`e15f4c78 e19e9058`e1a04278
e1c47270  e15f9d20`e15fb878 e170eda8`e1704d90
e1c47280  e1a0bea8`e1f71658 e1a841f0`e1a61550
e1c47290  e15464a0`e1091388 e1b16bc8`e1542440
e1c472a0  e1001d00`e1a014c0 e181efe0`e1c42118
e1c472b0  e1bb9178`e16f33e0 e15fa858`e19e9078
kd> dq
e170076a  04080000`000081cc 04016d53`6d4d0001
e170077a  00006156`4d430c07 00260012`6b76002c
e170078a  0001007c`fec80000 7250ffff`00010000
e170079a  6f724773`7365636f 7963696c`6f507075
e17007aa  0407ffff`ffffffff d4987346`744e0c05
e17007ba  00000000`0000e16e 4fcf0000`00000000
e17007ca  29880003`00000000 04050000`0000e170
e17007da  0401e24e`4d430001 f6886944`624f0c02

Showing the content of the memory:
Now that we now where Global Atom table is allocated, we can start inspecting its elements by displaying Unicode characters from memory by using du command:
kd> du e1ba3080`e1700738+C
e1700744  "DDEMLUnicodeClient"
kd> du e1546428`e1708f88+C
e1708f94  "ACTIVATESHELLWINDOW"

We can dump the memory using 32-bit pointers (dd command) and get the same results:

kd> dd e1c47240
e1c47240  e1ba3008 e1a841c0 e19fae00 e15fb8f8
e1c47250  e1700738 e1ba3080 e1708f88 e1546428
e1c47260  e15f4c78 e15f4e28 e1a04278 e19e9058
e1c47270  e15fb878 e15f9d20 e1704d90 e170eda8
e1c47280  e1f71658 e1a0bea8 e1a61550 e1a841f0
e1c47290  e1091388 e15464a0 e1542440 e1b16bc8
e1c472a0  e1a014c0 e1001d00 e1c42118 e181efe0
e1c472b0  e16f33e0 e1bb9178 e19e9078 e15fa858
kd> dd e1c472b0
e1c472b0  e16f33e0 e1bb9178 e19e9078 e15fa858
e1c472c0  e1a84198 e1a61500 e195d558 e15ff980
e1c472d0  e15424a0 00000000 00000000 00000000
e1c472e0  00000000 00000000 00000000 00000000
e1c472f0  00000000 00000000 00000000 00000000
e1c47300  00000000 00000000 00000000 00000000
e1c47310  00000000 00000000 00000000 00000000
e1c47320  00000000 00000000 00000000 00000000
kd> du e1ba3008+c
e1ba3014  "Native"
kd> du e15fa858+c
e15fa864  "OleClipboardPersistOnFlush"

To display the whole content we can use the following function to loop through all buckets, listing all atoms created: 

kd> r $t0=poi(poi(win32k!UserAtomTableHandle)+C)
kd> .for(r $t1=0; @$t1<@$t0; r $t1=@$t1+1) {du poi(poi(win32k!UserAtomTableHandle)+10+(@$t1*4))+C}
e1ba3014  "Native"
e1a841cc  "ObjectLink"
e19fae0c  "6.0.2600.2180!tooltips_class32"
e15fb904  "Static"
e1700744  "DDEMLUnicodeClient"
e1ba308c  "DataObject"
e1708f94  "ACTIVATESHELLWINDOW"
e1546434  "FlashWState"
e15f4c84  "SysCH"
e15f4e34  "PBrush"
e1a04284  "6.0.2600.2180!msctls_progress32"
e19e9064  "SysIC"
e15fb884  "DDEMLEvent"
e15f9d2c  "SHELLHOOK"
e1704d9c  "Custom Link Source"
e170edb4  "ControlOfsM*EVJK"
e1f71664  "DesktopSFTBarHost"
e1a0beb4  "SysDT"
e1a6155c  "Link Source"
e1a841fc  "FileName"
e1091394  "ReBarWindow32"
e15464ac  "SysWNDO"
e154244c  "DDEMLAnsiServer"
e1b16bd4  "SysLink"
e1a014cc  "NetworkName"
e1001d0c  "USER32"
e1c42124  "OleDraw"
e181efec  "FileNameW"
e16f33ec  "MoreOlePrivateData"
e1bb9184  "Edit"
e19e9084  "Binary"
e15fa864  "OleClipboardPersistOnFlush"
e1a841a4  "OwnerLink"
e1a6150c  "ListBox"
e195d564  "Embed Source"
e15ff98c  "SysIMEL"
e15424ac  "ComboLBox"

(Source code from http://blog.lordjeb.com/tag/debugging/)
Another way of displaying the content of Global Atom Table is by using !gatom command from user session on Windbg. (http://msdn.microsoft.com/en-us/library/ff563166(v=vs.85).aspx)

0:001> !gatom
Global atom table c00c( 1) = Protocols (18) pinned
c00d( 1) = Topics (12) pinned
c00e( 1) = Formats (14) pinned
c01f(36) = OleEndPointID (26)
c026( 1) = AppProperties (26)
c024( 1) = Delphi00000DB8 (28)
c01e(94) = UxSubclassInfo (28)
c010( 1) = EditEnvItems (24) pinned
c022( 3) = CAddressComboEx_This (40)
c01d( 6) = OleDropTargetMarshalHwnd (48)
c028( 1) = Delphi000009C0 (28)
c008( 1) = StdDoVerbItem (26) pinned
c01c( 6) = OleDropTargetInterface (44)
c02c( 1) = WndProcPtr0040000000000DC4 (52)
c02b( 1) = ControlOfs0040000000000DC4 (52)
c014( 1) = Save (8) pinned
c016( 1) = MSDraw (12) pinned
c02a( 1) = WndProcPtr0040000000000940 (52)
c024( 1) = Delphi00000DB8 (28)
c01e(94) = UxSubclassInfo (28)
c023( 3) = CAutoComplete_This (36)
c007( 1) = StdShowItem (22) pinned
c011( 1) = True (8) pinned
c010( 1) = EditEnvItems (24) pinned
c019( 1) = PROGMAN (14)
c012( 1) = False (10) pinned
c015( 1) = Close (10) pinned
c022( 3) = CAddressComboEx_This (40)
c004( 1) = StdEditDocument (30) pinned
c01d( 6) = OleDropTargetMarshalHwnd (48)
c028( 1) = Delphi000009C0 (28)
c008( 1) = StdDoVerbItem (26) pinned
c01c( 6) = OleDropTargetInterface (44)
c02c( 1) = WndProcPtr0040000000000DC4 (52)
c005( 1) = StdNewfromTemplate (36) pinned
c02b( 1) = ControlOfs0040000000000DC4 (52)
c014( 1) = Save (8) pinned
c016( 1) = MSDraw (12) pinned
c02a( 1) = WndProcPtr0040000000000940 (52)
c002( 1) = StdNewDocument (28) pinned
c027( 1) = ddeconv (14)
c029( 1) = ControlOfs0040000000000940 (52)
c00f( 1) = Status (12) pinned
c017(27) = ThemePropScrollBarCtl (42)
c009( 1) = System (12) pinned
c01b( 3) = AnimationID (22)
c025( 1) = Folders (14)
c00b( 1) = StdDocumentName (30) pinned
c013( 1) = Change (12) pinned
c001( 1) = StdExit (14) pinned
c006( 1) = StdCloseDocument (32) pinned
c01a( 6) = CAddressBand_This (34)
c00a( 1) = OLEsystem (18) pinned
c018( 3) = CC32SubclassInfo (32)

But As we can see, we are not displaying all table completely as it is quite difficult to find the correct entry as the table contains atoms, registered clipboard formats, classes, etc. If we follow the steps from Microsoft debug team, the job is far way easy:

kd> dq win32k!UserAtomTableHandle l1
bf9a7c18  81c58ce0`e1aa1da8
lkd> dq 81c58ce0`e1aa1da8+10 l1
e1aa1db8  e1929b00`e1423c58
kd> dt nt!_HANDLE_TABLE e1929b00`e1423c58
   +0x000 TableCode        : 0xe19012b8
   +0x004 QuotaProcess     : 0xc0040004 _EPROCESS
   +0x008 UniqueProcessId  : 0x06010001 Void
   +0x00c HandleTableLock  : [4] _EX_PUSH_LOCK
   +0x01c HandleTableList  : _LIST_ENTRY [ 0x4d4d49 - 0xc060405 ]
   +0x024 HandleContentionEvent : _EX_PUSH_LOCK
   +0x028 DebugInfo        : (null) 
   +0x02c ExtraInfoPages   : 0n4390977
   +0x030 FirstFree        : 0x490050
   +0x034 LastFree         : 0x50005c
   +0x038 NextHandleNeedingPool : 0x50004e
   +0x03c HandleCount      : 0n3473456
   +0x040 Flags            : 0x310030
   +0x040 StrictFIFO       : 0y0
kd> !list "-t nt!_RTL_ATOM_TABLE_ENTRY.HashLink -e -x \"du @$extret+C\" e1929b00`e1423c58"
du @$extret+C 
e1423c64  "Native"
du @$extret+C 
e19012c4  "Object Descriptor"
du @$extret+C 
e1905f14  "Link Source Descriptor"
du @$extret+C 
e17e3fec  "Button"
du @$extret+C 
e108d014  "msctls_updown32"
du @$extret+C 
e108e224  "6.0.2600.2180!Static"
du @$extret+C 
e1c016f4  "MSIMEMouseOperation"
du @$extret+C 
e112b32c  "WorkerW"
du @$extret+C 
e167218c  "MSUIM.Msg.StubCleanUp"
du @$extret+C 
e1e6f12c  "Desktop More Programs Pane"
du @$extret+C 
e1e949fc  "FileContents"
du @$extret+C 
e1be4c1c  "CMBExecute"
du @$extret+C 
e2218d8c  "C:\WINDOWS\system32\xpsp2res.dll"
e2218dcc  ""
du @$extret+C 
e222a80c  "C:\WINDOWS\WinSxS\x86_Microsoft."
e222a84c  "Windows.Common-Controls_6595b641"
e222a88c  "44ccf1df_6.0.2600.2180_x-ww_a84f"
e222a8cc  "1ff9\comctl32.dll"
du @$extret+C 
e16524dc  "ControlOfs0040000000000DC4"
du @$extret+C 
e1971964  "ControlOfs0040000000000940"
du @$extret+C 
e214b4c4  "ControlOfs0040000000000EC4"
du @$extret+C 
e162340c  "ControlOfs0040000000000274"
du @$extret+C 
e1e7b7fc  "application/x-compressed"

Using Atom table monitor v1.0:
I have developed a small tool to visually monitor all atoms and look for different patterns on it using regular expressions and display the match with a different colour.

This small app will use GlobalGetAtomName (to get all atoms that have been added using GlobalAddAtom) and GetClipboardFormatName (to get all atoms that have been added using RegisterWindowMessage) functions to get all atoms and display them into a 128x128 memory grid using Delphi XE. It also keeps track of the amount of atoms through time plotting the results in a chart.

procedure ScanAtoms;
var
  index: WORD;
  arrAtom, arrClipboard: array [0 .. 1024] of char;
  AtomName, RWMName: string;
  lenAtom, lenClipboard: integer;
begin
  for index := $C000 to $FFFF do
  begin
    lenAtom := GlobalGetAtomName(index, arrAtom, 1024);
    lenClipboard := GetClipboardFormatName(index, arrClipboard, 1024);
    if lenAtom > 0 then
      FATomTable[index - $C000].atom := StrPas(arrAtom)
    else if lenClipboard > 0 then
      FATomTable[index - $C000].atom := StrPas(arrClipboard);
  end;
end;

Features:
  • Display global atom table from 0xC000 to 0xFFFF.
  • Display atom string from table.
  • Look for patterns and match them with a specific colour
  • Counters detailing the matches.
  • Plotting amount of atoms.
Testing:
Tool can be tested by using GlobalAddAtom or RegisterWindowMessages functions and check out the monitor.

Download:
Jordi Corbilla

Related links:

Tuesday, 31 January 2012

Projecting 3D points to a 2D screen coordinate system

In this article I will put forward a small function to project 3D points to a 2D coordinate system. I have used VLO framework to display all the points after the calculation of different common functions. The most difficult part is to correctly project 3D points into a 2D perspective. Basically the way to display the points is by using a viewport and setting it up correctly. Once all the points are calculated, we need to apply the transformations matrix to position the viewport and then draw the 3D object projected into a 2D canvas from the viewport perspective. The following function will use rotation matrix to correctly position the points. 

Sample code:
procedure T3DMesh.CalcPoints();
function d3Dto2D (viewport: T3DViewPort; point : T3DPoint; centre : T2DPoint; zoom : double) : T2DPoint;
var
  p : T3DPoint;
  t : T3DPoint;
  d2d : T2DPoint;
begin
  p.x := viewport.x + point.x;
  p.y := viewport.y + point.y;
  p.z := viewport.z + point.z;

  t.x := (x * cos(viewport.roll)) - (z * sin(viewport.roll));
  t.z := (x * sin(viewport.roll)) + (z * cos(viewport.roll));
  t.y := (y * cos(viewport.pitch)) - (t.z * sin(viewport.pitch));
  z := (t.y * cos(viewport.pitch)) - (t.z * sin(viewport.pitch));
  x := (t.x * cos(viewport.yaw)) - (t.y * sin(viewport.yaw));
  y := (t.x * sin(viewport.yaw)) + (t.y * cos(viewport.yaw));
  d2d := nil;
  if z > 0 then
  begin
      d2d := T2DPoint.Create(((x / z) * zoom) + centre.x, ((y / z) * zoom) + centre.y);
  end;
  result := d2d;
end ;

var
  listPoint : TObjectList;
  i: Integer;
  j: Integer;
  x, y, z: Double;
  d2dpoint : T2DPoint;
begin
  listPoint := TObjectList.Create;
  listPoint2 := TObjectList.Create;

  x :=-2.0;
  y := -2.0;
  z := 0.0;
  for i := 0 to 40 do
  begin
    y := -2.0;
    for j := 0 to 40 do
    begin
      z := cos((x * x + y * y) * 2) * exp(-1 * ((x * x) + (y * y)));
      listPoint.Add(T3DPoint.Create(x,y,z));
      y := y + 0.1;
    end;
    x := x + 0.1;
  end;

  for i := 0 to listPoint.count - 1 do
  begin
    d2dpoint := d3Dto2D(viewport, T3DPoint(listPoint[i]), centre, zoom);
    if Assigned(d2dpoint) then
      listPoint2.Add(d2dpoint);
  end;
  listPoint.Free;
end;


Examples:
z = sin(x) * cos (y):

z = cos((x^2+y^2)*2) * exp-(((x^2)+(y^2))):

z = x^2*exp(-x*x)*y^2*exp(-y*y):


With parameters:

Related links:

Sunday, 15 January 2012

Scripting Language with Thundax P-Zaggy

In this article I will show you the approach taken to solve a common problem: Creating your own Scripting Language using a compiler. Sometimes we need to create our own language and to achieve that we either can create from scratch a lexical analyser in a high level language or use an open source automatic generator of lexical analysers. As I do not want to spent too much time in this task, I obviously chose the second one using an easy approach using JFLEX and CUP. There are many to choose from, e.g. Antlr, JavaCC, SableCC, Coco/R, BYacc/J, Beaver, etc. but I like the way Cup/Jflex work together. Jflex will automatically generate the finite automata of the lexical analyser through the regular expressions that define the tokens from a language. CUP is a system written in Java used to generate LALR syntax analysers. Cup file will contain the syntax definition of the language (Grammar) using a notation similar to BNF (Backus-Naur).
This approach will allow me to create a small compiler which will contain a lexical analyser, a syntax analyser and a semantic analyser without having to add extra workload to my simple Language builder. Cup/Jflex compiler will handle whether the language is well defined or not and in my Delphi project I only have a dummy function that will run if the previous compiler returns 0 errors.
I have also taken advantage of the SynEdit multi-line edit control to enhance the visualization of the simple Scripting Language. The exposed example is pretty basic and it will help to keep the grammar out of the Delphi code. I have tried other Delphi alternatives like GOLD parser and Coco/R but I did not get the expected results. So, If you have any better idea, please do let me know!.
This example will allow the user to define the graph by calling the layout object and connecting one node from another using a numeric description. In left side figure you can actually see the basics of the language and its definition. It invokes the layout object and then it generates a connection from the node(1) to node(2) and it ends the sentence with ";" character. There is no need to define the nodes as if a node does not exist it will be automatically created. The compiler created will manage all kind of errors and warnings and it will inform the user about them. Check that it will inform whether you are committing a lexical error (non recognized symbols, etc) and syntax error (bad composed expressions) through an Error or a semantic error (duplicated lines) through a warning.
A most developed example is shown in the following figure:
Once the language is free of errors, the generator is enabled and P-Zaggy can build the graph using the instructions previously defined and analysed by the external compiler. Note that to be able to check the language you must have installed Java.

The following graph have been created using the basic language:
As you can see it is faster than placing all the objects with the mouse and we can use external tools to generate the script and then check them with the compiler.

Here you can get the latest version of the app:


Jflex regular expressions:

<YYINITIAL>layout {
  if (CUP$parser$actions.verbose)
    System.out.println(yytext());
  pos += yytext().length();
  return new Symbol(sym.layout, new sToken(pos - yytext().length(), yyline, yytext()));
}
<YYINITIAL>node  {
  if (CUP$parser$actions.verbose)
    System.out.println(yytext());
  pos += yytext().length();
  return new Symbol(sym.node, new sToken(pos - yytext().length(), yyline, yytext()));
}
<YYINITIAL>\( {
  if (CUP$parser$actions.verbose)
    System.out.println(yytext());
  pos += yytext().length();
  return new Symbol(sym.OpenBracket, new sToken(pos - yytext().length(), yyline, yytext()));
}
<YYINITIAL>\)  {
  if (CUP$parser$actions.verbose)
    System.out.println(yytext());
  pos += yytext().length();
  return new Symbol(sym.CloseBracket, new sToken(pos - yytext().length(), yyline, yytext()));
}
<YYINITIAL>tonode {
  if (CUP$parser$actions.verbose)
    System.out.println(yytext());
  pos += yytext().length();
  return new Symbol(sym.tonode, new sToken(pos - yytext().length(), yyline, yytext()));
}
<YYINITIAL>parameters  {
  if (CUP$parser$actions.verbose)
    System.out.println(yytext());
  pos += yytext().length();
  return new Symbol(sym.parameters, new sToken(pos - yytext().length(), yyline, yytext()));
}
<YYINITIAL>\. {
  if (CUP$parser$actions.verbose)
    System.out.println(yytext());
  pos += yytext().length();
  return new Symbol(sym.dot, new sToken(pos - yytext().length(), yyline, yytext()));
}
<YYINITIAL>\; {
  if (CUP$parser$actions.verbose)
    System.out.println(yytext());
  pos += yytext().length();
  return new Symbol(sym.endSentence, new sToken(pos - yytext().length(), yyline, yytext()));
}
<YYINITIAL>[0-9]+ {
  if (CUP$parser$actions.verbose)
    System.out.println(yytext());
  pos += yytext().length();
  return new Symbol(sym.INTEGER, new sToken(pos - yytext().length(), yyline, yytext()));
}

[ \t\r\n]+  { pos = 1; }
[a-z]+          {
  System.out.println("ERROR at line "+(Integer.valueOf(yyline)+1) + ": '" +yytext() + "' not recognized");
  pos += yytext().length();
  return new Symbol(sym.ff, new sToken(pos - yytext().length(), yyline, yytext())); }
.   {
  System.out.println("ERROR at line "+(Integer.valueOf(yyline)+1) + ": '" +yytext() + "' not recognized");
  pos += yytext().length();
  return new Symbol(sym.ff, new sToken(pos - yytext().length(), yyline, yytext()));
}

Cup grammar:

terminal        sToken  layout, node, OpenBracket, CloseBracket, tonode, parameters, dot, endSentence, ff;

terminal  sToken          INTEGER, ITEM, QUOTE;

non terminal  scriptClass graphLine;
non terminal    sToken          itemBracket, itemReturned;

graphLine ::=  layout : t1
   dot : t2
   node : t3
   itemBracket : e1
   dot : t6
   tonode : t7
   itemBracket : e2
          endSentence : t10
          graphLine
                        {:
    scriptLines.AddTokens(e1, e2);
    RESULT = scriptLines; :}
          | error;

itemBracket ::= OpenBracket : t4
   itemReturned : e1
   CloseBracket : t5 {: RESULT = (sToken)e1; :}
                        | error;

itemReturned ::= INTEGER : e1 {: RESULT = (sToken)e1; :}
                 | error;

Related links:

Monday, 2 January 2012

Friday, 30 December 2011

Finite automata to Regular Grammar with Thundax P-Zaggy

Moving ahead with FA, now we can generate the right-linear grammar that corresponds to the finite automata generated. The system is pretty simple and basically what the algorithm does is to go recursively through every state and to store the production (A → bC) using the info from the transition and both states.

P-Zaggy will use description text to assign unique variable names to each state. Each transition in the graph matches a production in the right-linear grammar. For example (right side image), the transition from “A” to “B” along the transition “a” matches the production “A → aB”. This production will be automatically added into the production list in the "Finite Automaton (simulation)" tab. To generate the full grammar, just open your finite automata and press "Get grammar" button to get the linear grammar. Now as any final state carries the production to λ, λ symbol has been added to the system as empty string.

In the following image you will see a general example with the automatically generated grammar:


Example with λ string:


Demo video:

Get the latest version from here:

Looking back/ahead:
This would be the last post of the year 2011 and I'm taking this moment to review all the good stuff I have been publishing in my blog, from the physics engine to different code snippets and all of them with the help of a big an bright Delphi community. I always find difficult to get round to writing more often but I always have my notebook with me and every time and idea pops out I am there to catch it. I have an endless list of really interesting stuff I would like to tackle and I hope next year would be more fruitful as there are less than 50 articles for this year (even though the visits have been increased to almost 6K per month wow!, I am grateful for that!). I am also blissfully happy for putting grain of salt in the community as I have widely used the knowledge of others and it is great to pay back.
Next year I will be implementing different algorithms and I will be focusing on grammars, compilers and AI as I am very keen on those subjects and as you already know I like a lot visual stuff (And everything or almost everything with my loved Delphi). I'm preparing a roadmap about different utilities I have in mind and stay tuned for new and very interesting stuff!.
Happy new year 2012!.
Jordi Corbilla

Related links: