[PowerShell] convertfrom-string: the new way of extracting data?

By Robert Haneberg

With the September preview release of PowerShell 5.0, Microsoft introduced a cool new cmdlet called "convertfrom-string".

Have you tried it yet? You should! We all know the situation in automation or scripting where we have to work with data from somewhere else, such as log files or the output of external programs. Converting and manipulating that text is hard work: it takes regular expressions and delimiters to make it fit our needs in PowerShell.

Let's take a closer look at "convertfrom-string".

First of all, the command is built on FlashExtract. That technology seems to have been built into Excel 2013 first.

If you want to learn more:

http://research.microsoft.com/en-us/um/people/sumitg/pubs/pldi14-flashextract.pdf

The title says something about "Framework for data extraction" and that's all I understood. Too much math. So read it yourself if you're interested :)

But back to our PowerShell data problems and maybe a cool way to solve them.

I have a very trivial example with ping.exe. Yep, I know you can do it with "test-connection", but that returns objects and then the example wouldn't work. I need plain text from an external program, without objects.

So type in PowerShell:

ping.exe www.google.com -4

 

Your output will look like this:

 

[Image: ping output]

 

Now think about a long-running job with maybe 100 or 1000 pings to your server, where you want to sort the data by latency or something else. Yep, that's where the fun begins: splitting the data and finding the right expressions and delimiters to convert it to your needs.
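
To make that pain concrete, here is a minimal sketch of the manual approach with a regular expression. The sample lines, the address 203.0.113.10 and the names $pingOutput, $pattern and $replies are illustrative assumptions; real ping output depends on your Windows language settings.

```powershell
# Sample text standing in for ping.exe output (illustrative values, English locale assumed).
$pingOutput = @(
    'Pinging www.google.com [203.0.113.10] with 32 bytes of data:'
    'Reply from 203.0.113.10: bytes=32 time=23ms TTL=55'
    'Reply from 203.0.113.10: bytes=32 time=9ms TTL=55'
    'Reply from 203.0.113.10: bytes=32 time=118ms TTL=55'
)

# One regular expression with a named group per value we care about.
$pattern = 'Reply from (?<IP>[\d.]+): bytes=(?<Bytes>\d+) time[=<](?<Lat>\d+)ms TTL=(?<TTL>\d+)'

$replies = foreach ($line in $pingOutput) {
    if ($line -match $pattern) {
        # $Matches holds the named groups of the last successful -match.
        [pscustomobject]@{
            IP    = $Matches['IP']
            Bytes = [int]$Matches['Bytes']
            Lat   = [int]$Matches['Lat']
            TTL   = [int]$Matches['TTL']
        }
    }
}

# Fastest replies first.
$replies | Sort-Object Lat
```

It works, but the pattern has to be written and debugged by hand, and it breaks as soon as the text format changes.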

Time to bring in "convertfrom-string"

With convertfrom-string you can build templates for your data. The technology behind it learns from the template what your data looks like and interprets it the way you need.

Building the template (in my case a text file, but it could be a variable as well) takes some time, but it's worth it.

That's how it looks:

[Image: template]

 

I just copied the whole output from my ping example into that file, so FlashExtract now knows what the data looks like. More examples of data (especially where it can differ) lead to more accuracy. Next I defined the data I want to extract, using {PropertyName:Data}. The asterisk * seems to show FlashExtract the beginning of a data set, which is called a "sequence". Without the asterisk it won't work, because you can't have two properties with the same name in one sequence:

 

[Image: template error]

 

With the asterisk you tell FlashExtract where a new data set begins; in our example, that is a new ping reply with the same properties but different data.

As you can see in the template, I defined the latency property as an integer with [int32], the same way we do it in PowerShell.
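
A minimal sketch of what such a template can look like, kept in a variable instead of a file. The sample lines, the address 203.0.113.10 and the variable name $template are illustrative assumptions and not my original template file; the property names IP, Bytes, Lat and TTL are the ones used in the select-object call further down. Your ping output may look different, so treat this as a starting point.

```powershell
# Hypothetical template: example replies copied from ping output, with the
# values of interest wrapped in {Name:value}. The * after IP marks the start
# of each record, and [int32] types the latency.
$template = @'
Pinging www.google.com [203.0.113.10] with 32 bytes of data:
Reply from {IP*:203.0.113.10}: bytes={Bytes:32} time={[int32]Lat:23}ms TTL={TTL:55}
Reply from {IP*:203.0.113.10}: bytes={Bytes:32} time={[int32]Lat:118}ms TTL={TTL:55}
'@

# -TemplateContent takes the template from a variable instead of a file.
# Guarded so the snippet does nothing where the cmdlet is not available.
if (Get-Command ConvertFrom-String -ErrorAction SilentlyContinue) {
    ping.exe www.google.com -4 | ConvertFrom-String -TemplateContent $template
}
```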


So the template is ready. Time to fire up the command:

ping.exe www.google.com -4 | convertfrom-string -templatefile [PathtoTemplate]

[Image: standard output]

 

Yeah, OK, that's not exactly what we want. The cmdlet automatically creates the "ExtentText" property, which stores the original data. That could be a "feature" of the PowerShell preview version.

But we can handle that with "select-object" and make it beautifully GUI-ish by using "out-gridview":

ping.exe www.google.com -4 | convertfrom-string -templatefile [PathtoTemplate] | select-object IP,Bytes,Lat,TTL | out-gridview

 

[Image: output in out-gridview]

 

Nice, isn't it? We now have the data in a more useful form in less time. Afterwards we can store and use it however we want.
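
Once the replies are objects, sorting by latency or saving them is one pipeline step each. A small sketch, using hand-built objects in place of real convertfrom-string output so that it runs anywhere; the values, the variable names and the file name ping-results.csv are assumptions for illustration, and the property names follow the select-object call above.

```powershell
# Hand-built objects standing in for the parsed ping replies (illustrative values).
$parsed = @(
    [pscustomobject]@{ IP = '203.0.113.10'; Bytes = 32; Lat = 23;  TTL = 55 }
    [pscustomobject]@{ IP = '203.0.113.10'; Bytes = 32; Lat = 118; TTL = 55 }
    [pscustomobject]@{ IP = '203.0.113.10'; Bytes = 32; Lat = 9;   TTL = 55 }
)

# Slowest replies first.
$sorted = $parsed | Sort-Object Lat -Descending

# Average, minimum and maximum latency in one call.
$stats = $parsed | Measure-Object -Property Lat -Average -Minimum -Maximum

# Store the result for later use; the file name is a hypothetical example.
$csvPath = Join-Path ([System.IO.Path]::GetTempPath()) 'ping-results.csv'
$sorted | Export-Csv -Path $csvPath -NoTypeInformation
```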

The ping example is a very easy one, just to get a feel for "convertfrom-string".

If "convertfrom-string" has trouble interpreting your output, just add more variations of example data:

[Image: extended template]

 

That solved most of my problems.
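
As a rough idea of what "more variations" can mean for the ping case, here is a sketch of a template whose examples cover one-, two- and three-digit latencies, a second address and a second TTL value. All values and the names $extendedTemplate and $templatePath are assumptions for illustration; how many variations you need depends on your data.

```powershell
# Hypothetical extended template: same {Name:value} markup, more varied examples.
$extendedTemplate = @'
Reply from {IP*:203.0.113.10}: bytes={Bytes:32} time={[int32]Lat:9}ms TTL={TTL:55}
Reply from {IP*:203.0.113.10}: bytes={Bytes:32} time={[int32]Lat:23}ms TTL={TTL:55}
Reply from {IP*:198.51.100.7}: bytes={Bytes:32} time={[int32]Lat:118}ms TTL={TTL:118}
'@

# Save it as a text file so it can be passed to -templatefile.
$templatePath = Join-Path ([System.IO.Path]::GetTempPath()) 'ping-template.txt'
Set-Content -Path $templatePath -Value $extendedTemplate

# Afterwards: ping.exe www.google.com -4 | convertfrom-string -templatefile $templatePath
```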

But beware, it is still just a preview. There can be a couple of bugs and errors, as always :)

If you want to try the PowerShell 5.0 preview you can get it here:

 

http://www.microsoft.com/en-us/download/details.aspx?id=44987
