Friday, September 21, 2012

Split and merge data in URLs

AppId is over the quota
AppId is over the quota

(Who could be better than a Business Intelligence analyst at teaching us a thing or two about Excel? We asked our own number-crunching wizard Stacey Armstrong to share some Excel tricks she's learned along the way. Check out her first trick Resizing column widths in pivot tables. )

I have a list of URLs in one column and their corresponding page views in another. The URLs include a name for a section of the site and the date when the page was published. I'd like to calculate the total number of page views by site section and the total number of page views by publication date. To save time, I want to avoid manually entering this data in new columns. Instead, I plan on using the Convert Text to Columns wizard and Concatenate function. Once you learn the steps, it will take you only a few minutes to complete the tasks.  These steps can also be applied to other situations when you need to split and merge data.

In the sample URLs below, I want to split off the site sections "contactus" and "home." And I want to split off the date that the pages were published "4/15/2012" and "4/11/2012". Since the URLs have slashes / between each of its parts, it will be easy use the Convert Text to Columns wizard (Wizard) to split off the section name and publication date, and turn them into new columns

 http://www.mysite.com/contactus/subpage/4/15/2012/nameofthepage.aspx

http://www.mysite.com/home/subpage/4/11/2012/nameofthepage.aspx  

Select and copy the column with the URLs, then paste into a new column to the right of existing columns.

Highlight this new column, click the Data tab, and in the Data Tools, click Convert Text to Columns wizard.

In the Wizard, choose Delimited, and click Next.


In the next page of the Wizard, uncheck Tab and chose Other.

In the box next to Other, enter a slash /, and then click Finish. (In my example URLs, the slash / separates the parts of the URL, so you can use it to split off the data.)

The site section names and the dates are now split into three columns. Delete the other new columns that the Wizard just created.

Next you'll recombine the date into one cell (month/day/year) by using the Concatenate function. This lets you create a pivot table that pivots by the complete date.

To the right of the three new columns (month, day, and year), enter the Concatenate function in the first blank cell.  =CONCATENATE(D2,"/",E2,"/",F2). Notice that I added "/" between the month/day/year.

The result is 4/11/2010. Drag this formula down to fill the cells in the column.

Highlight this column with the new dates, and copy, then paste values in the same column. This converts your formulas to text.

Delete the Month, Day, and Year columns. Now you'll have two new columns, which you can title Site Section and Date, and two old columns Page and Page Views.

Following these steps lets you split off data from the original URL and add it to the spreadsheet in such a way that lets me create a pivot table.  You can use the pivot table calculate page view totals for site sections and dates.


View the original article here

No comments:

Post a Comment