Hey, hi, how are ya. I wouldn't be Gypsy if I wasn't trying to parse something, so here is my latest parse, along with some insight on how to think critically. The thing about parsing is that it's always a puzzle. Some puzzles (parsing .pls files) are very simple and others (parsing JSON) can be quite a bit more complicated. I had an idea, and to make it happen I needed to switch gears and go from parsing delimiters to parsing file directories.
Let's fill in some blanks first. I have a Zip library that does everything under the sun, except give me an actual directory structure. Instead of:
folder/
..subfolder/
....file
I get:
"folder/subfolder/file"
Essentially, instead of giving me a structure, it spits out all the files as if their entire path were their name. This simply will not do. I didn't write the Zip library, and I decided it would be much easier to extend the library than to track down and change one of its primary functions.
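To make the flat listing concrete, here is a small sketch that uses Python's standard zipfile module as a stand-in. The Zip library discussed in this post is a different one, so treat this purely as an illustration of the same behavior; the helper name flat_names and the archive contents are made up for the example.

```python
import io
import zipfile


def flat_names(entries):
    """Build a zip in memory from {path: text} and return its entry names."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for path, text in entries.items():
            archive.writestr(path, text)
    # Re-open the same buffer for reading and list what the archive reports.
    with zipfile.ZipFile(buffer) as archive:
        return archive.namelist()


names = flat_names({"folder/subfolder/file": "hello"})
print(names)  # each entry is one flat string, not a nested structure
```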
This led to a whole slew of problems. The ultimate goal is to have an Object that accurately represents the entire directory/file structure. Below is an example of the results I wanted:
Code:
entire_zip: { folder1: { file1:contents, file2:contents }, folder2: { folder55: { file82:contents, file99:contents } } }
This way I can get files like this in my code:
entire_zip.folder2.folder55.file82
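As a rough Python rendering of that target (Python is an assumption here, the post itself is language-neutral), the same shape can be written as nested dictionaries, with bracket lookups standing in for the dotted access. The "contents of ..." strings are placeholders.

```python
# Placeholder strings stand in for real file contents.
entire_zip = {
    "folder1": {"file1": "contents of file1", "file2": "contents of file2"},
    "folder2": {
        "folder55": {"file82": "contents of file82", "file99": "contents of file99"},
    },
}

# Equivalent of entire_zip.folder2.folder55.file82
wanted = entire_zip["folder2"]["folder55"]["file82"]
print(wanted)
```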
Now I will officially explain how I solved this puzzle.
The first thing to consider is whether or not something already exists. folder55 is a good example. Technically there are 2 paths that will have folder55 and it's imperative that we create it if it doesn't exist and skip it if it does. This is when I had the idea to make a reverse comparison stack (just made that up).
Essentially what I do is take the full path that the zip provides me with and chop it into an array on every "/". So "folder1/file1" becomes array("folder1","file1"). I then loop backwards through the array, creating every stage of possibilities. Let's make this our example path:
"folder1/folder2/folder3/file"
I chop that into
array(folder1,folder2,folder3,file)
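The chopping step might look like this in Python. The function name split_path is hypothetical, and dropping empty parts (so a directory entry ending in "/" doesn't leave a blank name) is an extra assumption that the post doesn't mention.

```python
def split_path(path):
    """Chop a zip entry name into its parts on every "/"."""
    # A trailing "/" would otherwise leave an empty last part, so skip blanks.
    return [part for part in path.split("/") if part]


parts = split_path("folder1/folder2/folder3/file")
print(parts)
```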
I then use another array to store results
storage[0] = Object()
storage[0][array[3]] = contents //file name mapped to its contents
storage[1] = Object()
storage[1][array[2]] = storage[0] //folder3 holding the file
storage[2] = Object()
storage[2][array[1]] = storage[1] //folder2, folder3 and the file
storage[3] = Object()
storage[3][array[0]] = storage[2] //all folders and the file
You see? Each index of the storage array is a completed portion, from its position clean out to the file contents. That's half of the trick. Now we have to compare our final object to each array position until we get one that doesn't already exist.
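Here is one way the backwards loop could be sketched in Python. This is not the original script: build_stages is a hypothetical name, and in this version stage 0 holds the file name mapped to its contents, with each later stage wrapping the previous one in the next folder up.

```python
def build_stages(parts, contents):
    """Return the nested stages for one path, innermost first."""
    stages = [{parts[-1]: contents}]       # the file name mapped to its contents
    for name in reversed(parts[:-1]):      # walk the folders from last to first
        stages.append({name: stages[-1]})  # wrap the previous stage in this folder
    return stages


stages = build_stages(["folder1", "folder2", "folder3", "file"], "file contents")
print(stages[-1])  # the last stage is the complete nest for the whole path
```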
That's easier said than done, though. Every time we check whether a name already exists, we are also obligated to step into that nest when it does.
For instance:
Object[someName] (does this equal) array[2]? For this example, say it does.
Now we have to become Object[someName] so we can check from there whether the next array index is a match. If we did this:
Object = Object[someName]
we would actually be truncating our main object, saying "now you only equal yourself from this point". This is where reference is key.
temp = @Object
Now temp points to Object, and when we want to go a nest deeper we just tell temp to act as a pointer to that new nest. Voilà, our object gets populated by proxy.
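Putting the two halves together, a minimal Python sketch of the idea could look like the following. It is an approximation under assumptions, not the 134-line script: the function names are hypothetical, and since Python names already refer to objects, a plain assignment to temp plays the role of the reference.

```python
def build_stages(parts, contents):
    """Return the nested stages for one path, innermost first."""
    stages = [{parts[-1]: contents}]
    for name in reversed(parts[:-1]):
        stages.append({name: stages[-1]})
    return stages


def insert_path(root, path, contents):
    """Merge one "a/b/c" entry into the nested dict root, in place."""
    parts = [part for part in path.split("/") if part]
    if not parts:
        return root
    stages = build_stages(parts, contents)
    temp = root  # temp refers to root itself, it is not a copy
    for depth, name in enumerate(parts):
        existing = temp.get(name)
        if isinstance(existing, dict) and depth < len(parts) - 1:
            temp = existing  # the nest exists: step into it and keep checking
            continue
        # First name that is missing: attach the prebuilt stage that starts here.
        stage = stages[len(parts) - 1 - depth]
        temp[name] = stage[name]
        break
    return root


entire_zip = {}
for path in ["folder1/file1", "folder1/file2",
             "folder2/folder55/file82", "folder2/folder55/file99"]:
    insert_path(entire_zip, path, "contents of " + path)

print(entire_zip["folder2"]["folder55"]["file82"])
```

Reassigning temp never touches entire_zip itself, which is the point of the pointer trick: only the name temp moves deeper, while the main object keeps its full structure.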
This was very hard to try and explain simply, and I know I didn't do the best job. It is not at all obvious that a reverse array stack comparison assignment by reference loop was the solution to my problem. As a matter of fact, before yesterday I had never even conceived of such a thing.
My success in achieving my goal and concocting such a system came from NOT relying on my knowledge. If I had relied on my knowledge, who knows how many damn loops and crazy shit I would have made to finally get my results. My current script is only 134 lines, and easily 50 of those are loading the zip and package declarations/imports/etc. So, 84 lines (approx) to PERFECTLY parse files and their directory structure into an object.
When I first set out to solve this problem I was thinking it was going to be hundreds upon hundreds of lines, and I even began writing hundreds upon hundreds of lines. A little coffee, some secret agent music and about 30 minutes of staring at the wall gave me ideas that shortened the script substantially.