Data Driven Web Automation

One of WinTask's strengths is the ability to record a specific web based task and to modify the resultant script to expand the scope of the task. Topic Capturing Web Data Items illustrated how to extract data from a web page. The usefulness of the generated script is limited by the hard coded query information sent to the web site. A Data Driven script uses a list of query information to repetitively query the web site building a much larger set of data.

This topic provides a tutorial of how to record a specific web based task and to transform the script into a data driven script. The requirements for the data driven script are as follows:

Using the web site www.marketwatch.com, read the current stock price quotes for a list of stocks from an Excel spreadsheet and store the results in the same spreadsheet. The list of stock ticker symbols will be located in column A and the associated price quote is to be stored in column B. The maximum number of stock ticker symbols listed in column A of the spreadsheet that need to be processed is 100. The script will stop processing stock ticker symbols on the first empty cell in column A if the maximum number is not reached.

Generate a script to use as a template

We start by recording the extraction of a stock quote for ticker symbol AES to generate script that will be used as the basis for the data driven script.

  1. Start WinTask. If the Your First Script Wizard dialog box is displayed, click the Close button. The WinTask Editor window should now be active.

  2. From the WinTask toolbar, click the Rec button rec.jpg to start recording your actions.

  3. The Start Recording Mode dialog box will appear asking What do you want to start before recording?. Select the Internet Explorer radio button and click the OK button.

  4. In the following dialog box, Launching Internet Explorer, type "www.marketwatch.com/tools/quotes/profile.asp" into the Web address text field and click the OK button.

  5. Once the MarketWatch.com web page loads, type "AES" into the Enter Symbols, Fund, Keyword field and click the arrow icon to the right of the field.

  6. The page will reload with the stock information for stock ticker symbol AES. The data we're interested is the quote in the section titled Quotes are delayed.... The section can be found near the top of the web page.

  7. Click the Capture button on the floating WinTask toolbar. The Capture button is the third button from the left on the toolbar with the "T and magic wand" icon.

 

  1. The first screen of the Capture Wizard subtitled Specify the window where the data to be captured are is displayed. Click the Spy button. The mouse cursor changes to a "Crosshair within a circle". Move the cursor over the area on the web page labeled Quotes are delayed... displaying the quote and volume. When you see the selection rectangle around the table, click the left mouse button to capture the data within the table.

  2. The screen subtitled Specify the HTML element where the data to be captured are comes back and shows the captured data. When the script is replayed, WinTask uses the HTML Descriptor of the HTML object to locate the data content for capture. A suggested HTML Descriptor is displayed in the HTML Descriptor field. In our example, the suggested HTML Descriptor, "TABLE[CONTENT='After']", needs to be modified slightly. Press the Change proposed content button.

  3. The Content validation dialog is now opened. The proposed HTML Descriptor is "After" and we would like "AES". However, "AES" is displayed many times on the page, so to have a discriminant content, we need to change the value of the CONTENT property to "AES /quotes". In the Captured text field, select "AES/quotes", copy those words into the Change proposed content text field located near the bottom of the dialog.

  4. Press the Validate button. The values listed in the Descriptor and Proposed content static text fields are updated. Press the OK button to accept the new HTML Descriptor and close the dialog. Click Next button.

  5. The screen subtitled Take only some of the captured data is displayed. We want to capture only the number between $ and Change. Uncheck the checkbox The beginning and in the Between field, type "$". Uncheck the checkbox The end and in the And field, type "Change". Click Show extracted data button to check the extracted string. Click Paste into the script button to close the Capture Wizard.

  6. Close Internet Explorer.

  7. Stop Recording Mode by clicking the Stop button on the floating WinTask toolbar. The Stop button is the first button on the left side of the toolbar with the "X and rectangle" icon.

  8. The WinTask Editor window is now restored and the script statements generated during Recording Mode are inserted into the current script document window.

The following statements were generated during Recording mode. Comments have been added to explain each script statement. The MsgBox function at the end of the script has been added manually to verify that the script correctly captures the stock price quote.

 

' Start Internet Explorer and load the web page.

StartBrowser("IE", "www.marketwatch.com/tools/quotes/profile.asp")

 

' Wait for the marketwatch.com stock quotes web page to load.

UsePage("MarketWatch.com")

 

' The next two statements enter the stock ticker symbol into the text

' field and clicks the arrow button respectively requesting the price

' of the stock.

WriteHTML("INPUT TEXT[NAME= 'symb']", "AES")

ClickHTMLElement("INPUT IMAGE[SRC= 'http://www.marketwat']")

 

' Wait for the page to be updated with the stock price quote.

UsePage("AES Profile")

 

' Capture the stock price quote from DIV HTML object on the web page.

ret = CaptureHTML("DIV[CONTENT='AES " + Chr$(13) + Chr$(10) + "/quotes']",captured_string$)

 

' Extract only the price.

result_string$ = ExtractBetween$(captured_string$,"$",excluded,"Change",excluded)

 

 

' Close Internet Explorer

CloseWindow("IEXPLORE.EXE|IEFrame|AES Profile",1)

 

 

' THESE LINES ADDED MANUALLY TO VERIFY OPERATION OF SCRIPT

' Display the stock price quote captured from the web page.

MsgBox(result_string$, 0, "Price for Stock Ticker Symbol: AES")

 

Add the MsgBox function if you wish to test the script. Save the script using the name "C:\quote_AES.src". Compile and execute the script. Observe the price of the stock as the script executes. The price displayed in the message box will match the value observed on the web page.

Update the script to use a string variable for the Stock Ticker Symbol

The next step in developing a Data Driven Web Automation script is to replace all instances of the hard coded stock ticker symbol AES with a string variable. We also take this opportunity to change the name of the captured string to a string array to prepare the next step. All changes from the previous script are shown in RED.

 

' Allocate a string array to hold the data captured from the web page.

' NOTE #1: The array can store 5 elements indexed from 0 to 4.

' NOTE #2: The capacity of the array can be as small as 1.

Dim CapturedData$(5)

 

' Start Internet Explorer and load the web page.

StartBrowser("IE", "www.marketwatch.com/tools/quotes/profile.asp")

 

' Define a string variable to store the Stock Ticker Symbol

TickerSymbol$ = "AES"

 

' Wait for the marketwatch.com stock quotes web page to load.

UsePage("MarketWatch.com")

 

' The next two statements enter the stock ticker symbol into the text

' field and clicks the arrow button respectively requesting the price

' of the stock.

WriteHTML("INPUT TEXT[NAME= 'symb']", TickerSymbol$)

ClickHTMLElement("INPUT IMAGE[SRC= 'http://www.marketwat']")

 

' Wait for the page to be updated with the stock price quote.

' As the page title starts with the stock ticker symbol,

' the UsePage parameter (the title of the page) has to be dynamically built.

' The concatenation operator is the +

UsePage(TickerSymbol$+" "+"Profile")

 

' Capture the stock price quote from the DIV HTML object on the web page.

' Again the CONTENT keyword has to be built dynamically.

ret = CaptureHTML("DIV[CONTENT='"+TickerSymbol$+" "+ Chr$(13) + Chr$(10) + "/quotes']",captured_string$)

 

' Extract only the price.

result_string$ = ExtractBetween$(captured_string$,"$",excluded,"Change",excluded)

' Fill the first element of the array with the extracted price
CapturedData$(0)=result_string$

 

' Close Internet Explorer - Truncate the window name to delete the reference to AES

CloseWindow("IEXPLORE.EXE|IEFrame|",1)

 

 

' THESE LINES ADDED MANUALLY TO VERIFY OPERATION OF SCRIPT

' Display the stock price quote captured from the web page.

MsgBox(CapturedData$(0), 0, "Price for Stock Ticker Symbol: "+TickerSymbol$)

 

Most of the changes to the script are straightforward. Of special note however is the change made to the CaptureHTML function. The HTML Descriptor has been modified from a simple string to three strings concatenated together with the "+" operator. Be careful not to drop the single quotes from the resultant string or the script will not replay correctly.

Save the script using the name "C:\quote_variable.src". Compile and execute the script. This version of the script exhibits the same behavior as the first script. The message box displayed at the end of the script will report the same results for AES as the first script.

Update the script to read the Stock Ticker Symbols from a spreadsheet

This step will read a number of stock ticker symbols from an Excel spreadsheet and look up the stock price quote for the first stock ticker symbol listed in the spreadsheet. All changes from the previous script are shown in RED.

 

' Allocate a string array to hold the data captured from the web page.

' NOTE #1: The array can store 5 elements indexed from 0 to 4.

' NOTE #2: The capacity of the array can be as small as 1.

Dim CapturedData$(5)

 

' Allocate an array of 100 strings to store the stock ticker symbols.

' NOTE: The array can store 100 elements indexed from 0 to 99.

Dim StockTickerSymbols$(100)

 

' Define a variable to index into both string arrays.

StockIndex = 0

 

' Define string variables relating to the Excel spreadsheet file

ExcelSpreadsheet$ = "C:\DailyStockQuotes.xls"

CellsForSymbols$ = "A1:A100" ' Maximum of 100 stock ticker symbols

 

' Read the list of stock ticker symbols from the spreadsheet.

' The data will be read into the symbols string array.

ReadExcel(ExcelSpreadsheet$, CellsForSymbols$, StockTickerSymbols$())

 

' Start Internet Explorer and load the web page.

StartBrowser("IE", "www.marketwatch.com/tools/quotes/profile.asp")

 

' Define a string variable to hold the current stock ticker symbol

TickerSymbol$ = StockTickerSymbols$(StockIndex)

 

' Wait for the www.marketwatch.com stock quotes web page to load.

UsePage("MarketWatch.com")

 

' The next two statements enter the stock ticker symbol into the text

' field and clicks the arrow button respectively requesting the price

' of the stock.

WriteHTML("INPUT TEXT[NAME= 'symb']", TickerSymbol$)

ClickHTMLElement("INPUT IMAGE[SRC= 'http://www.marketwat']")

 

' Wait for the page to be updated with the stock price quote.

UsePage(TickerSymbol$+" "+"Profile")

 

' Capture the stock price quote from the DIV HTML object on the web page.

ret = CaptureHTML("DIV[CONTENT='"+TickerSymbol$+" "+ Chr$(13) + Chr$(10) + "/quotes']",captured_string$)

' Extract only the price.

result_string$ = ExtractBetween$(captured_string$,"$",excluded,"Change",excluded)

' Fill the first element of the array with the extracted price
CapturedData$(0)=result_string$

 

 

' Close Internet Explorer

CloseWindow("IEXPLORE.EXE|IEFrame|",1)

 

 

' THESE LINES ADDED MANUALLY TO VERIFY OPERATION OF SCRIPT

' Display the stock price quote captured from the web page.

MsgBox(CapturedData$(0), 0, "Price for Stock Ticker Symbol: "+TickerSymbol$)

' Display the second and third symbols read from the spreadsheet.

MsgBox(StockTickerSymbols$(1), 0, "Second spreadsheet Stock Ticker Symbol")

MsgBox(StockTickerSymbols$(2), 0, "Third spreadsheet Stock Ticker Symbol")

 

Save the script using the name "C:\quote_read_excel.src". Open Excel and type the stock ticker symbols AES, ANF, and ALB into cells A1 through A3 inclusive. Save the spreadsheet using the name "C:\DailyStockQuotes.xls".

Return to the WinTask editor. Compile and execute the script. The first message box displayed at the end of the script will report the same results for AES as the two previous scripts in this topic. The succeeding message boxes will report the results for ANF and ALB respectively.

Update the script to add iteration

Now that we have a list of stock ticker symbols in memory, its time to add iteration to the script. Iteration will allow the script to read the stock prices quotes for more symbols than just the single symbol of the previous scripts. All changes from the previous script are shown in RED.

 

' Allocate a string array to hold the data captured from the web page.

' NOTE #1: The array can store 5 elements indexed from 0 to 4.

' NOTE #2: The capacity of the array can be as small as 1.

Dim CapturedData$(5)

 

' Allocate two arrays of 100 strings to store the stock ticker

' symbols and the stock price quotes respectively.

' NOTE: Both arrays can store 100 elements indexed from 0 to 99.

Dim StockTickerSymbols$(100)

Dim StockPriceQuotes$(100)

 

' Define the maximum index for the string arrays.

MAX_ARRAY_INDEX = 100

 

' Define a variable to index into both string arrays.

StockIndex = 0

 

' Define string variables relating to the Excel spreadsheet file

ExcelSpreadsheet$ = "C:\DailyStockQuotes.xls"

CellsForSymbols$ = "A1:A100" ' Maximum of 100 stock ticker symbols

 

' Read the list of stock ticker symbols from the spreadsheet.

' The data will be read into the symbols string array.

ReadExcel(ExcelSpreadsheet$, CellsForSymbols$, StockTickerSymbols$())

 

' Start Internet Explorer and load the web page.

StartBrowser("IE", "www.marketwatch.com/tools/quotes/profile.asp")

 

' Define a string variable to hold the current stock ticker symbol

TickerSymbol$ = StockTickerSymbols$(StockIndex)

 

' Wait for the www.marketwatch.com stock quotes web page to load.

UsePage("MarketWatch.com")

 

' The next two statements enter the stock ticker symbol

' into the text field and clicks the arrow button respectively

' requesting the price of the stock.

WriteHTML("INPUT TEXT[NAME= 'symb']", TickerSymbol$)

ClickHTMLElement("INPUT IMAGE[SRC= 'http://www.marketwat']")

 

' Capture stock price quotes as long we have a stock ticker symbol

repeat

 

' Wait for the page to be updated with the stock price quote.

UsePage(TickerSymbol$+" "+"Profile")

' Capture the stock price quote from the DIV HTML object on the web page.

ret = CaptureHTML("DIV[CONTENT='"+TickerSymbol$+" "+ Chr$(13) + Chr$(10) + "/quotes']",captured_string$)

' Extract only the price.

result_string$ = ExtractBetween$(captured_string$,"$",excluded,"Change",excluded)

' Fill the first element of the array with the extracted price
CapturedData$(0)=result_string$

 

' Copy the captured data into the stock price quotes array

StockPriceQuotes$(StockIndex) = CapturedData$(0)

 

' Get the next stock ticker symbol from the array

StockIndex = StockIndex + 1

If (StockIndex < MAX_ARRAY_INDEX) Then

TickerSymbol$ = StockTickerSymbols$(StockIndex)

Else

TickerSymbol$ = "" ' Reached end of array

EndIf

 

' Write the new Symbol and click Go button

WriteHTML("INPUT TEXT[ID= 'autocomplete_input']",TickerSymbol$)
ClickHTMLElement("DIV[OUTERTEXT= 'GO',INDEX='2']")

'Wait for the page to be updated with the new stock ticker symbol.
UsePage(TickerSymbol$)
'Click the Profile Tab to get the Profile information within the page.
ClickHTMLElement("A[INNERTEXT= 'Profile']")

 

until (TickerSymbol$ <> "")

 

' Close Internet Explorer

CloseWindow("IEXPLORE.EXE|IEFrame|",1)

 

 

' THESE LINES ADDED MANUALLY TO VERIFY OPERATION OF SCRIPT

' Display the stock price quotes captured from the web page.

StockIndex = 0

While (StockIndex < 3)

MsgBox(StockPriceQuotes$(StockIndex), 0, "Price for Stock Ticker Symbol: "+StockTickerSymbols$(StockIndex))

StockIndex = StockIndex +1

Wend

 

Save the script using the name "C:\quote_iteration.src". Compile and execute the script. This script uses the Excel spreadsheet created for the previous script, "C:\DailyStockQuotes.xls".

A total of three message boxes will be displayed. The first will report the results for AES, the second for ANF and the third for ALB. The message boxes should be identical to the previous script in this topic.

Update the script to store the price quotes in the spreadsheet

The final change to the script is to store the accumulated stock price quotes into the Excel spreadsheet. All changes from the previous script are shown in RED.

 

' Allocate a string array to hold the data captured from the web page.

' NOTE #1: The array can store 5 elements indexed from 0 to 4.

' NOTE #2: The capacity of the array can be as small as 1.

Dim CapturedData$(5)

 

' Allocate two arrays of 100 strings to store the stock ticker

' symbols and the stock price quotes respectively.

' NOTE: Both arrays can store 100 elements indexed from 0 to 99.

Dim StockTickerSymbols$(100)

Dim StockPriceQuotes$(100)

 

' Define the maximum index for the string arrays.

MAX_ARRAY_INDEX = 100

 

' Define a variable to index into both string arrays.

StockIndex = 0

 

' Define string variables relating to the Excel spreadsheet file

ExcelSpreadsheet$ = "C:\DailyStockQuotes.xls"

CellsForSymbols$ = "A1:A100" ' Maximum of 100 stock ticker symbols

CellsForQuotes$ = "B1:B100" ' Maximum of 100 stock price quotes

 

' Read the list of stock ticker symbols from the spreadsheet.

' The data will be read into the symbols string array.

ReadExcel(ExcelSpreadsheet$, CellsForSymbols$, StockTickerSymbols$())

 

' Start Internet Explorer and load the web page.

StartBrowser("IE", "www.marketwatch.com/tools/quotes/profile.asp")

 

' Define a string variable to hold the current stock ticker symbol

TickerSymbol$ = StockTickerSymbols$(StockIndex)

' Wait for the marketwatch.com stock quotes web page to load.

UsePage("MarketWatch.com")

 

' The next two statements enter the stock ticker symbol

' into the text field and clicks the arrow button respectively

' requesting the price of the stock.

WriteHTML("INPUT TEXT[NAME= 'symb']", TickerSymbol$)

ClickHTMLElement("INPUT IMAGE[SRC= 'http://www.marketwat']")

 

 

' Capture stock price quotes as long we have a stock ticker symbol

While (TickerSymbol$ <> "")

 

'' Wait for the page to be updated with the stock price quote.

UsePage(TickerSymbol$+" "+"Profile")

' Capture the stock price quote from the DIV HTML object on the web page.

ret = CaptureHTML("DIV[CONTENT='"+TickerSymbol$+" "+ Chr$(13) + Chr$(10) + "/quotes']",captured_string$)

' Extract only the price.

result_string$ = ExtractBetween$(captured_string$,"$",excluded,"Change",excluded)

' Fill the first element of the array with the extracted price
CapturedData$(0)=result_string$

 

' Copy the captured data into the stock price quotes array

StockPriceQuotes$(StockIndex) = CapturedData$(0)

 

' Get the next stock ticker symbol from the array

StockIndex = StockIndex + 1

If (StockIndex < MAX_ARRAY_INDEX) Then

TickerSymbol$ = StockTickerSymbols$(StockIndex)

Else

TickerSymbol$ = "" ' Reached end of array

EndIf

 

Wend

 

' Close Internet Explorer

CloseWindow("IEXPLORE.EXE|IEFrame|",1)

 

' Write the captured stock prices quotes into the spreadsheet

WriteExcel(ExcelSpreadsheet$, CellsForQuotes$, StockPriceQuotes$())

 

Save the script using the name "C:\quote_data_driven.src". Compile and execute the script. This script uses the Excel spreadsheet created for the previous scripts, "C:\DailyStockQuotes.xls".

The captured stock price quotes are now saved in the Excel spreadsheet. When the spreadsheet is inspected, the price quote of each stock is listed in column B next to its respective stock ticker symbol.