Poker Hand XML - Page 10

_dave_ · #91 08-01-2007, 07:27 PM

I know very little about XML, so I have little to say in this thread (but I wholeheartedly support this movement, and will no doubt learn XML as a result). Still, to a noob like me, it seems OrcaDK is right on the money with this one.

Human readability has little place (especially at the expense of functionality) in the raw XML, except for debugging, as was pointed out earlier. Surely when it is read by humans, the raw XML will be transformed via XSLT or similar into far more readable text?

Knowing little as I do, I would be inclined to go with this representation:
[ QUOTE ]
I think every card should be expressed like so
<cards (other attributes...)>
<card value="a" suit="h" />
<card value="9" suit="c" />
</cards>

[/ QUOTE ]

mainly thinking it would be far easier to retrieve hands containing 2-flush / 3-flush / rainbow flops with no string parsing involved... am I wrong here?

dave.

OrcaDK · #92 08-01-2007, 07:37 PM

[ QUOTE ]
I will answer your questions with 1 simple question:
In both examples which data type will the "8d" validate to?

[ QUOTE ]
The primary reason for creating a standardized XML format is to make machine-based interchange easier, not to make it more readable by humans.

[/ QUOTE ]

This I agree with. Which makes me wonder why some of the examples posted here have half a dozen nested elements and include information that the hand analyzer should be in charge of.

To me a hand history is a way to store a hand, nothing else. It's telling a story. The hand analyzer should be the tool that fills in the details and allows us to investigate the story piece by piece.

[/ QUOTE ]

"8d" will be validated as a string through regexp. How the hand analyzer / [application that reads the XML file] represents 8d, either as a string or as a "Hand" class, that's up to the application itself.

I completely agree with you that the XML format should only store the hand in a 100% objective format, meaning no textual representations of the results, that should be up to the application itself to create. At most we could include standardized hand strength values like {"Flush", "Straight Flush", "High Card"}, but no more.
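The regexp validation mentioned above could look roughly like the sketch below. The card notation (rank 2-9, T, J, Q, K, A followed by suit c, d, h, s) and the space-separated list form are assumptions, since the thread has not settled on either; the function names are made up for illustration.

```python
import re

# Assumed notation: one rank character followed by one suit character.
CARD = re.compile(r"[2-9TJQKA][cdhs]")
# Hypothetical concatenated form: cards separated by single spaces.
CARD_LIST = re.compile(r"[2-9TJQKA][cdhs](?: [2-9TJQKA][cdhs])*")


def is_card(text: str) -> bool:
    """True if the whole string is exactly one card, e.g. "8d"."""
    return CARD.fullmatch(text) is not None


def is_card_list(text: str) -> bool:
    """True if the whole string is one or more space-separated cards."""
    return CARD_LIST.fullmatch(text) is not None


print(is_card("8d"))
print(is_card("8x"))
print(is_card_list("4c Js Tc"))
```

Note that a pattern like this only checks the shape of the text. It cannot reject the same card appearing twice, so that check would still be up to the application.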

Shoe Lace · #93 08-01-2007, 08:06 PM

[ QUOTE ]
mainly thinking it would be far easier to retrieve hands containing 2-flush / 3-flush / rainbow flops with no string parsing involved... am I wrong here?

[/ QUOTE ]

How do you propose we get this information without having to parse anything?

The code required to check if it's 1/2/3 tone would be different depending on which method you used. Either one would require some type of logic or direct parsing.

Ex. for a string that contains this -> "4c Js Tc"

You could count the occurrences of each suit in the string (using IndexOf()). If a suit occurs twice then it's 2 tone.

Ex. doing it the other way...

You could compare the suit attribute values for each element against each other to see if any match and keep track of the count for each suit.

Don't forget that the libraries or classes that certain languages have for XML are filled with code we didn't write. I'd be curious to see what the actual source code for the entire xml namespace in .NET looks like. Surely they are hiding all the dirty parse work from you, allowing you to use neatly packed classes instead.
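To make the two approaches above concrete, here is a minimal sketch of both in Python (not .NET, so no IndexOf). It assumes a three-card flop, the "4c Js Tc" string form, and the value/suit attribute form proposed earlier in the thread; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

FLOP_STRING = "4c Js Tc"
FLOP_XML = (
    '<cards>'
    '<card value="4" suit="c"/>'
    '<card value="J" suit="s"/>'
    '<card value="T" suit="c"/>'
    '</cards>'
)


def suit_counts_from_string(flop: str) -> Counter:
    # Split on whitespace; the suit is the last character of each card.
    return Counter(card[-1] for card in flop.split())


def suit_counts_from_elements(cards_xml: str) -> Counter:
    # Read the suit attribute of every <card> child element.
    root = ET.fromstring(cards_xml)
    return Counter(card.get("suit") for card in root.findall("card"))


def tone(counts: Counter) -> str:
    # Assumes a three-card flop: the number of distinct suits decides the tone.
    return {1: "monotone", 2: "two-tone", 3: "rainbow"}[len(counts)]


print(tone(suit_counts_from_string(FLOP_STRING)))
print(tone(suit_counts_from_elements(FLOP_XML)))
```

Both versions end up with the same per-suit counts. The difference is only where the parsing happens: in our own split, or inside the XML library.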

OrcaDK · #94 08-01-2007, 08:11 PM

[ QUOTE ]
The code required to check if it's 1/2/3 tone would be different depending on which method you used. Either one would require some type of logic or direct parsing.

Ex. for a string that contains this -> "4c Js Tc"

You could count the occurrences of each suit in the string (using IndexOf()). If a suit occurs twice then it's 2 tone.

Ex. doing it the other way...

You could compare the suit attribute values for each element against each other to see if any match and keep track of the count for each suit.

Don't forget that the libraries or classes that certain languages have for XML are filled with code we didn't write. I'd be curious to see what the actual source code for the entire xml namespace in .NET looks like. Surely they are hiding all the dirty parse work from you, allowing you to use neatly packed classes instead.

[/ QUOTE ]

Using IndexOf is not a viable solution because, again, it can't be done on the fly for database / XPath querying. (In database querying we'd be able to use a simple COUNT(1) with GROUP BY [Suit] to see if there were any flushes.)

And yes, all XML libraries hide the "ugly" stuff, but that's exactly what they're supposed to do. That is the main thought behind object oriented programming, that we use already made libraries that we know work. There's no reason for us to reinvent the wheel, we can concentrate on the domain specific problems.

Shoe Lace · #95 08-01-2007, 08:31 PM

[ QUOTE ]
Using IndexOf is not a viable solution because, again, it can't be done on the fly for database / XPath querying. (In database querying we'd be able to use a simple COUNT(1) with GROUP BY [Suit] to see if there were any flushes.)

[/ QUOTE ]

Can you give a more detailed example of how this relates to performance?

I know nothing about XPath. I figured it was a standard scripting language of some sort tailor made for XML.

Your query above looks like a standard DB query. If [suit] is getting replaced by an actual suit letter then I don't see how reading the 2nd character of a split result is any slower than reading a specific value of an element's attribute.

Am I missing something obvious? Both methods would use "c" in the query if the card were a club, right?

mikechops · #96 08-01-2007, 08:50 PM

[ QUOTE ]
[ QUOTE ]
[ QUOTE ]
I still think I'd prefer cards concatenated. It's trivial to parse them, it takes up much less space and makes the hand a little easier on the eye. But whatever...

[/ QUOTE ]

But they are each a separate piece of data.

[/ QUOTE ]

I've given two advantages of combining them. And the advantage of separating them is?

[/ QUOTE ]

[ QUOTE ]

Concatenating them is illogical as each card is a logical unit, not the two (or more) cards together.

[/ QUOTE ]

But you won't be accessing some cards in isolation. If you want to reference the flop cards, you'll need to access them all, not individually. Similarly for hole cards.

[ QUOTE ]

Concatenating the cards makes it tough to write proper value validation. Keeping them separate would enable validation through a very simple regexp; concatenating them makes this more difficult, as there could potentially be an unlimited number of cards, making it nondeterministic (and impossible to validate through a DTD, for instance).

Concatenating them makes XPath expressions tougher: most XPath engines don't have a "LIKE" operator, so it would be cumbersome to simply make a //card[@value='Ac'] lookup to find all aces of clubs.

[/ QUOTE ]

I'm unfamiliar with the languages you mention, but if you can parse and validate one card I don't see why it would be a problem to validate multiple cards. Unless they don't support loops, in which case aren't there better alternatives?

[ QUOTE ]

The space saving from concatenating the cards is negligible imho. Separate elements also make it simpler to determine the number of cards, simply by counting the elements instead of having to parse an attribute.

[/ QUOTE ]

Depends upon the game. In Omaha 8 or better, where you get a lot of multiway showdowns with four cards per hand, this is definitely not true.

[ QUOTE ]

Imagine we were to save this XML document in a database (take SQL Server 2005, which supports the XML datatype and inline XPath in SQL). If we were to make a query like "how many cards does the average showdown hand include, combining all games", then we'd have to write complex inline SQL code to count the number of delimiting chars to determine the number of cards - and this is per hand. It would be a lot easier to just count the number of card elements inline.

[/ QUOTE ]

Why would anyone want to do this?

rvg72 · #97 08-01-2007, 08:53 PM

Great discussion so far. Some point soon the group should decide on a couple of people to collaborate and take a shot at the first draft.

rvg

OrcaDK · #98 08-01-2007, 08:53 PM

Let's say we have an XML document like this:

<root>
<hand />
<hand />
</root>

Where the <root> element contains, let's say, 1000 hands. Now we'd like to see all hands where there were three cards of the same suit on the flop. Using XPath we could make a query that returns all hands where the suit count was 3 for any suit, meaning all three flop cards share the same suit.

Using IndexOf we'd have to loop each <hand> element and process the <flop hand="xx xx xx" /> element to see if the hand was of interest to us.
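As a rough illustration of the element-based query, the sketch below uses Python's standard ElementTree module. ElementTree supports only a subset of XPath (no count() function), so the attribute filter is done with a path expression and the counting in Python. The <hand id>, <flop> and <card> layout is an assumption for the example, not an agreed format.

```python
import xml.etree.ElementTree as ET

DOC = """
<root>
  <hand id="1">
    <flop><card value="A" suit="c"/><card value="8" suit="d"/><card value="9" suit="h"/></flop>
  </hand>
  <hand id="2">
    <flop><card value="A" suit="c"/><card value="2" suit="c"/><card value="3" suit="c"/></flop>
  </hand>
</root>
"""


def monotone_flop_hands(xml_text: str) -> list:
    """Return the ids of hands whose flop has three cards of one suit."""
    root = ET.fromstring(xml_text)
    result = []
    for hand in root.findall("hand"):
        for suit in "cdhs":
            # Path predicate selects only the flop cards of this suit.
            if len(hand.findall(f"flop/card[@suit='{suit}']")) == 3:
                result.append(hand.get("id"))
                break
    return result


print(monotone_flop_hands(DOC))
```

A full XPath engine could presumably fold the count into the expression itself. The point here is only that the suit is addressable without any string splitting.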

As for the DB example. Using concatenation we would have a non-normalized table structure like the following (and again we'd have to loop all hands to do the IndexOf processing work):

[Hands]:
HandID | Flop
1 | Ac 8d 9h
2 | Ac 2c 3c

With the non normalized structure there is no easy way to express "all hands with three suited cards for a flop".

If we instead normalize the table to third normal form:

[Hands]
HandID
1
2

[FlopCards]
HandID | Card | Suit
1 | A | c
1 | 8 | d
1 | 9 | h
2 | A | c
2 | 2 | c
2 | 3 | c

Then we could get all hands having three suited cards for a flop with the following query (not the most optimal query, but it's more readable):
SELECT HandID, COUNT(1) AS SuitCount FROM FlopCards GROUP BY HandID, Suit HAVING COUNT(1) = 3
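
The normalized layout and the grouping query can be tried out with an in-memory SQLite database, as in the sketch below. SQLite is only a stand-in here (the thread mentions SQL Server 2005), and the column types are assumptions. The HAVING clause repeats the aggregate instead of using the column alias, because not every database engine accepts an alias there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Hands (HandID INTEGER PRIMARY KEY);
    CREATE TABLE FlopCards (
        HandID INTEGER REFERENCES Hands(HandID),
        Card   TEXT,
        Suit   TEXT
    );
    INSERT INTO Hands VALUES (1), (2);
    INSERT INTO FlopCards VALUES
        (1, 'A', 'c'), (1, '8', 'd'), (1, '9', 'h'),
        (2, 'A', 'c'), (2, '2', 'c'), (2, '3', 'c');
""")

# One row per (hand, suit) group; keep only groups holding all three flop cards.
rows = conn.execute("""
    SELECT HandID, Suit, COUNT(*) AS SuitCount
    FROM FlopCards
    GROUP BY HandID, Suit
    HAVING COUNT(*) = 3
""").fetchall()

print(rows)
```

With the sample rows from the tables above, only hand 2 (Ac 2c 3c) comes back.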

OrcaDK · #99 08-01-2007, 09:02 PM

[ QUOTE ]
[ QUOTE ]

Concatenating them is illogical as each card is a logical unit, not the two (or more) cards together.

[/ QUOTE ]

But you won't be accessing some cards in isolation. If you want to reference the flop cards, you'll need to access them all, not individually. Similarly for hole cards.

[/ QUOTE ]
Surely a situation could arise where we would have to access the cards in an isolated manner. This standard would have to support any possible uses in the future; there is no reason to make a standard so limited in functionality that it wouldn't have any real use for a lot of projects.

[ QUOTE ]
[ QUOTE ]

Concatenating the cards makes it tough to write proper value validation. Keeping them separate would enable validation through a very simple regexp; concatenating them makes this more difficult, as there could potentially be an unlimited number of cards, making it nondeterministic (and impossible to validate through a DTD, for instance).

Concatenating them makes XPath expressions tougher: most XPath engines don't have a "LIKE" operator, so it would be cumbersome to simply make a //card[@value='Ac'] lookup to find all aces of clubs.

[/ QUOTE ]

I'm unfamiliar with the languages you mention, but if you can parse and validate one card I don't see why it would be a problem to validate multiple cards. Unless they don't support loops, in which case aren't there better alternatives?

[/ QUOTE ]
XPath is the foremost expression language used to query XML documents. DTD is one of the most widespread type description "languages" available. In most modern cases it's been superseded by XSD, which is far more expressive. For any standard we would have to write a validating XSD schema / DTD document that any submitted hand would have to conform to.

[ QUOTE ]
[ QUOTE ]

Imagine we were to save this XML document in a database (take SQL Server 2005, which supports the XML datatype and inline XPath in SQL). If we were to make a query like "how many cards does the average showdown hand include, combining all games", then we'd have to write complex inline SQL code to count the number of delimiting chars to determine the number of cards - and this is per hand. It would be a lot easier to just count the number of card elements inline.

[/ QUOTE ]

Why would anyone want to do this?

[/ QUOTE ]
That was just a quick example of a query that would be very cumbersome to perform using the concatenated card form. Other examples are:

The number of flops containing suited cards. The number of hands including the card 'Ac'. The number of hands where 'Ac' was the first card dealt as part of the flop. The number of flops containing connected cards. And so forth...
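
A few of these example queries are sketched below against per-card elements, using Python's standard ElementTree module. The element layout, the value/suit attributes, the sample hands, and the reading of "connected" as two cards of adjacent rank (ace counted high only) are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

RANKS = "23456789TJQKA"

DOC = """<root>
  <hand id="1"><flop><card value="A" suit="c"/><card value="8" suit="d"/><card value="9" suit="h"/></flop></hand>
  <hand id="2"><flop><card value="7" suit="s"/><card value="A" suit="c"/><card value="3" suit="s"/></flop></hand>
  <hand id="3"><flop><card value="K" suit="h"/><card value="Q" suit="d"/><card value="2" suit="c"/></flop></hand>
</root>"""

root = ET.fromstring(DOC)
# Map each hand id to its flop cards, in dealt (document) order.
flops = {hand.get("id"): hand.findall("flop/card") for hand in root.findall("hand")}


def is_ac(card) -> bool:
    return card.get("value") == "A" and card.get("suit") == "c"


def has_connected(cards) -> bool:
    # Two cards are "connected" here if their ranks are adjacent.
    idx = sorted(RANKS.index(c.get("value")) for c in cards)
    return any(b - a == 1 for a, b in zip(idx, idx[1:]))


with_ac = [h for h, cards in flops.items() if any(is_ac(c) for c in cards)]
ac_first = [h for h, cards in flops.items() if cards and is_ac(cards[0])]
suited = [h for h, cards in flops.items()
          if len({c.get("suit") for c in cards}) < len(cards)]
connected = [h for h, cards in flops.items() if has_connected(cards)]

print(with_ac, ac_first, suited, connected)
```

None of these needs delimiter counting or substring searches, since rank, suit and position are all available directly from the elements.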

mikechops · #100 08-01-2007, 09:18 PM

[ QUOTE ]
Great discussion so far. Some point soon the group should decide on a couple of people to collaborate and take a shot at the first draft.

rvg

[/ QUOTE ]

Yeah, I think someone should step up and write a converter. If it works, I think that format will become the standard.

I think it would be a shame if it were unnecessarily bloated. If this standard takes off we could be talking about petabytes of info. E.g. At the moment I am unconvinced we need to store hole or flop cards separately.

Also it would be nice if we had some validation of amounts won and lost in a hand.

But these are minor issues imo. If someone is going to the trouble of writing a converter, I'm certainly not going to quibble about either point.