Getting into Mobile: Frequency

While I was posting to textmate while waiting for blogger to come back up - my mac crashed!

Here's what was saved:

I need to append the two frequency views together. First, let's figure out how to append one file to the next. Google shows something like:

INSERT INTO newtable (fields..)
SELECT fields.. FROM OldTable
If I follow that, I will need to create a table from the first view. Then insert into the a second table from that. So, how do I insert into the first table? The same way, I guess.

So, let's change it to this:

INSERT INTO all_words_temp1 (level, number, kanji, hiragana, english, _id, freq)
SELECT level, number, kanji, hiragana, english, _id, freq
FROM v_all_words_freq_kanji_match;

It needs a pre-existing table

CREATE TABLE all_words_freq_outer
(level INTEGER,
freq INTEGER,
number INTEGER,
kanji TEXT(25),
hiragana TEXT(25),
english TEXT(50),
“_id” INTEGER)

Ok, that's what was preserved. But, I actually I got as far as appending the two views into all_words_temp1. Then, Iwent through some shenanigans to insert the remaining, unmatched (on frequency) rows into the table.

I then looked at the remaining unmatched ones, and see if there's a way to match the remaining:

SELECT level, number, kanji, hiragana, english, _id, freq from all_words_temp1 where freq is null;

This gives me 245 entries (I previously deleted 3 invalid header entries which had snuck in there).

I made a backup of the all_words_temp1 table.

Then, I created a view which only pulled out those with nulls in freq with the honorable prefix "お", called v_o_prefix.

I'm having a problem figuring out what's in some of the views.

Ah, show info in RazorSQL shows it:

CREATE VIEW v_o_prefix as SELECT level, number, kanji, hiragana, english, _id, SUBSTR (hiragana, 2, 999) as hiragana_no_prefix from all_words_temp1 where freq is null and hiragana like 'お%';

Ok, so the o_prefix does strip the first 'お%' - and freq is null and it's from all_words_temp1. Good.

I was looking at the hiragana instead of the hiragana_no_prefix field.

Now I can do this to get a match on freq once the honorific prefix is stripped:

SELECT level, number, kanji, hiragana, english, _id, freq_list.freq
from v_o_prefix, freq_list
where v_o_prefix.hiragana_no_prefix = freq_list.word

There we go. I'm finally back to pre-crash mode. It turns out there are only 10 of these. Let's add this as a view:

create view v_o_prefix_freq_match as SELECT level, number, kanji, hiragana, english, _id, freq_list.freq
from v_o_prefix, freq_list
where v_o_prefix.hiragana_no_prefix = freq_list.word

Now we need to put them in the all_words_temp1 table. Let's take another stab at this construction:

update TableA set A2 =
(select B2 from TableB where A1=B1)
where A1 in (select B1 from TableB);

// this part is easy

update TableA set A2 =

// B2 is what we're getting from table B. And A1/B1 are the matching values. In this case, we can just match on anything, like _id. The A1 is passed from the below select statement.

(select B2 from TableB where A1=B1)

// this statement will give the above any A1's which are found in b.

where A1 in (select B1 from TableB);

Let's tackle the last one first:

where _id in (select _id from v_o_prefix_freq_match);

So, we'll get those 10 ids. Next the middle line:

(select v_o_prefix_freq_match.freq from v_o_prefix_freq_match where all_words_temp1._id=v_o_prefix_freq_match._id)

Finally, the first line:

update all_words_temp1 set freq =

Let's put them all together.

update all_words_temp1 set freq =
(select v_o_prefix_freq_match.freq from v_o_prefix_freq_match where all_words_temp1._id=v_o_prefix_freq_match._id)
where _id in (select _id from v_o_prefix_freq_match);

Let's try out this monster.

Well, it ran, but did it work?

select * from v_o_prefix

29 records. It was 39 before.

select * from v_o_prefix_freq_match

No records found. Whew - a lot of work for 10 records. But maybe I can apply it to other patterns.

Getting into Mobile

Friday, May 13, 2011

Frequency - next steps

1 comment: