Another thing I'd like to do with my app is to incorporate frequency of usage. This will be used to group the vocabulary into game levels, starting with the most widely used words first. I have an Open Office spreadsheet of the word frequency of use, and five files in XML corresponding to five different levels of test. So, my goal for today is to smush the frequency of use into the XML, or, more likely, convert the XML into SQL and then add it in.
I'll be honest - the easiest database to work with in my experience is Microsoft Access. You can really get things done in a jiffy with the query builder. But, I'm on a mac now, and while I could bring up the VM and Access, the VM really slows things down. Another database front end I've worked quite a bit with is phpMyAdmin, particularly in a PHP course taught by the amazing David Malan at Harvard. So, my goal for now is simply to get the XML file into any SQL file. I had originally used XML for the file because the example application I was initially working off of used XML, and I was able to generate the XML automatically after pasting a word list into Excel. Since I want to leave open the possibility I will get dynamic with my data base (user, high scores, current level, wrong guesses, etc), and since I'm more familiar with SQL, I'll just go with that option.
So, the first task on my list will be to simply convert the current database I have from XML to sql. It's a pretty simply file. There should be some tools out there to help do this quickly. If not, I could probably do a global find and replace to create the insert sql statements. Let's do a quick google on that. Well, someone just did it with XML to csv, and from there to the db. I wonder if I should just start working directly with SQLLite, because that person imported it using the SQLLite Manager. Naw. In the interests of a kind of test-first mentality, my logical test will be: get the XML into ANY sql database; don't encumber the task.
Here's an example of the current XML.
I'm going to give a try at importing into an open office spreadsheet, and breaking it up by column. Let's give it a try. First, I'll create a "sandbox" by copying the the files from eclipse into a tempory folder on my desktop called "convert_xml". Now, let's open it with calc....
Unfortunately, unlike Excel, it doesn't give the option of separating the imported data into columns. How about that XML to CSV utility I saw somewhere? Well, it's pay. Well, actually, my wife's pc has excel, so I'll open it up on hers. Sure enough, it breaks it nicely into columns. I'll create the five spreadsheets for the five data files (all the same format). Oddly, it's a hassle to get them to append in Excel in one huge fiel. Well, that will be simple enough with Access, which I as I mentioned I have on my VM. Fire that baby up. It's slow, but not that slow. With access, I'll suck all five spreadsheets into a table, and take it from there.
Actually, it's pretty slow, slower than remembered. While I was waiting for it to get all started up, I opened one of the spreadsheets from excel into calc, then exported into to CSV file - and it works. That looks pretty good to me right now. I wonder if I can pull it right into a database with myphpeadmin. Let's try, I have it up and running at my provider. Hmmm...oddly, I'm having trouble finding the create database option.
Ok, after some hassling around, I've got the data created for level 5. Now for my next act, I will try to append level 4 to the same table. Otherwise, I'll have to do some kind of union query. No, wait. All I have to do is copy and paste all the csv files into a single file and I'm good to go. Wrangler, btw, has been nicely showing all those nasty Japanese characters in utf8.
Let's get these exports to csv done. Done. Now, to make things even quicker, I might do a unix cat command. There - I got rid of the headers except for level5, and ran this command: cat level5.csv level4.csv level3.csv level2.csv level1.csv > all.csv
Ok, let's do another import - and it's done. Although I had to drop the previous table to get the dialog that lets you use the first row as column names. Hmmm...the table could use a unicode value. Well, this app just really needs the level and the sequence number within the level.
Ok. What next? I'm starting to feel like I need to start working with SQLLite, which is the db for both Android and the Mac. Onto the next post!
No comments:
Post a Comment