Data Driven Web Automation
One of WinTask's strengths is the ability to record a specific web based task and to modify the resultant script to expand the scope of the task. Topic Capturing Web Data Items illustrated how to extract data from a web page. The usefulness of the generated script is limited by the hard coded query information sent to the web site. A Data Driven script uses a list of query information to repetitively query the web site building a much larger set of data.
This topic provides a tutorial of how to record a specific web based task and to transform the script into a data driven script. The requirements for the data driven script are as follows:
Using the web site www.marketwatch.com, read the current stock price quotes for a list of stocks from an Excel spreadsheet and store the results in the same spreadsheet. The list of stock ticker symbols will be located in column A and the associated price quote is to be stored in column B. The maximum number of stock ticker symbols listed in column A of the spreadsheet that need to be processed is 100. The script will stop processing stock ticker symbols on the first empty cell in column A if the maximum number is not reached.
Generate a script to use as a template
We start by recording the extraction of a stock quote for ticker symbol AES to generate script that will be used as the basis for the data driven script.
Start WinTask. If the Your First Script Wizard dialog box is displayed, click the Close button. The WinTask Editor window should now be active.
From the WinTask toolbar, click the Rec button
to start recording your actions.
The Start Recording Mode dialog box will appear asking What do you want to start before recording?. Select the Internet Explorer radio button and click the OK button.
In the following dialog box, Launching Internet Explorer, type "www.marketwatch.com/tools/quotes/profile.asp" into the Web address text field and click the OK button.
Once the Profile web page loads, type "AES" into the Change Symbol field and click the arrow icon to the right of the field.
The page will reload with the stock information for stock ticker symbol AES. The data we're interested is in the section titled Aes Corp (NYSE) Delayed quote data. The section can be found near the top of the web page.
Click the Capture button on the floating WinTask toolbar. The Capture button is the third button from the left on the toolbar with the "T and magic wand" icon.
The first screen of the Capture Wizard subtitled Specify the window where the data to be captured are is displayed. Click the Spy button. The mouse cursor changes to a "Crosshair within a circle". Move the cursor over the area on the web page labeled Aes Corp (NYSE) Delayed quote data. When you see the selection rectangle around the table, click the left mouse button to capture the data within the table.
Press the Next button.
The screen subtitled Specify the HTML element where the data to be captured are is displayed. When the script is replayed, WinTask uses the HTML Descriptor of the table to locate the data content for capture. A suggested HTML Descriptor is displayed in the HTML Descriptor field. In our example, the suggested HTML Descriptor, "TABLE[CONTENT='Aes']", needs to be modified slightly. Press the Change proposed content button.
The Content validation dialog is now opened. We originally typed in "AES" but the proposed HTML Descriptor is "Aes". We need to change the value of the CONTENT property to ensure that the correct the table is located on the web page. Type "AES" into the Change proposed content text field located near the bottom of the dialog.
Press the Validate button. The values listed in the Descriptor and Proposed content static text fields are updated. Press the OK button to accept the new HTML Descriptor and close the dialog.
Press the Next button.
The Capture Wizard screen subtitled Select the data you want to capture is displayed. For our purposes, we are only interested in the stock ticker price. Select cell R3C2 and then click the Next button.
The screen subtitled Specify where to copy the captured data is displayed. In our example, we will be temporarily storing the stock ticker price in an array before writing it to an Excel format file. Select the Array radio button and then click the Next button.
The screen subtitled Capture finished is displayed. Click the Paste into the script button to close the Capture Wizard.
Close Internet Explorer.
Stop Recording Mode by clicking the Stop button on the floating WinTask toolbar. The Stop button is the first button on the left side of the toolbar with the "X and rectangle" icon.
The WinTask Editor window is now restored and the script statements generated during Recording Mode are inserted into the current script document window.
The following statements were generated during Recording mode. Comments have been added to explain each script statement. The MsgBox function at the end of the script has been added manually to verify that the script correctly captures the stock price quote.
' Allocate an array of 100 strings to store the stock ticker prices.
' NOTE: The array can store 100 elements indexed from 0 to 99.
Dim tabcell_0$(100)
' Start Internet Explorer and load the web page.
StartBrowser("IE", "www.marketwatch.com/tools/quotes/profile.asp")
' Wait for the www.marketwatch.com stock quotes web page to load.
UsePage("Profile")
' The next two statements enter the stock ticker symbol into the text
' field and clicks the arrow button respectively requesting the price
' of the stock.
WriteHTML("INPUT TEXT[NAME= 'symb']", "AES")
ClickHTMLElement("INPUT IMAGE[SRC= 'http://www.marketwat']")
' Wait for the page to be updated with the stock price quote.
UsePage("Profile")
' Capture the stock price quote from the table on the web page.
' A single cell from the table is captured and written into the
' first element of the temporary string array.
ret = CaptureTableHTML("TABLE[CONTENT='AES']", "R3C2:R3C2", tabcell_0$())
' Close Internet Explorer
CloseWindow("IEXPLORE.EXE|IEFrame|Profile - Microsoft Internet Explorer",1)
' THESE LINES ADDED MANUALLY TO VERIFY OPERATION OF SCRIPT
' Display the stock price quote captured from the web page.
MsgBox(tabcell_0$(0), 0, "Price for Stock Ticker Symbol: AES")
Add the MsgBox function if you wish to test the script. Save the script using the name "C:\quote_AES.src". Compile and execute the script. Observe the price of the stock as the script executes. The price displayed in the message box will match the value observed on the web page.
Update the script to use a string variable for the Stock Ticker Symbol
The next step in developing a Data Driven Web Automation script is to replace all instances of the hard coded stock ticker symbol AES with a string variable. We also take this opportunity to change the name of the string array to be more descriptive of its contents. All changes from the previous script are shown in RED.
' Allocate a string array to hold the data captured from the web page.
' NOTE #1: The array can store 5 elements indexed from 0 to 4.
' NOTE #2: The capacity of the array can be as small as 1.
Dim CapturedData$(5)
' Start Internet Explorer and load the web page.
StartBrowser("IE", "www.marketwatch.com/tools/quotes/profile.asp")
' Define a string variable to store the Stock Ticker Symbol
TickerSymbol$ = "AES"
' Wait for the www.marketwatch.com stock quotes web page to load.
UsePage("Profile")
' The next two statements enter the stock ticker symbol into the text
' field and clicks the arrow button respectively requesting the price
' of the stock.
WriteHTML("INPUT TEXT[NAME= 'symb']", TickerSymbol$)
ClickHTMLElement("INPUT IMAGE[SRC= 'http://www.marketwat']")
' Wait for the page to be updated with the stock price quote.
UsePage("Profile")
' Capture the stock price quote from the table on the web page.
' A single cell from the table is captured and written into the
' first element of the temporary string array.
ret = CaptureTableHTML("TABLE[CONTENT='"+TickerSymbol$+"']", "R3C2:R3C2", CapturedData$())
' Close Internet Explorer
CloseWindow("IEXPLORE.EXE|IEFrame|Profile - Microsoft Internet Explorer",1)
' THESE LINES ADDED MANUALLY TO VERIFY OPERATION OF SCRIPT
' Display the stock price quote captured from the web page.
MsgBox(CapturedData$(0), 0, "Price for Stock Ticker Symbol: "+TickerSymbol$)
Most of the changes to the script are straightforward. Of special note however is the change made to the CaptureTableHTML function. The HTML Descriptor has been modified from a simple string to three strings concatenated together with the "+" operator. Be careful not to drop the single quotes from the resultant string or the script will not replay correctly.
Save the script using the name "C:\quote_variable.src". Compile and execute the script. This version of the script exhibits the same behavior as the first script. The message box displayed at the end of the script will report the same results for AES as the first script.
Update the script to read the Stock Ticker Symbols from a spreadsheet
This step will read a number of stock ticker symbols from an Excel spreadsheet and look up the stock price quote for the first stock ticker symbol listed in the spreadsheet. All changes from the previous script are shown in RED.
' Allocate a string array to hold the data captured from the web page.
' NOTE #1: The array can store 5 elements indexed from 0 to 4.
' NOTE #2: The capacity of the array can be as small as 1.
Dim CapturedData$(5)
' Allocate an array of 100 strings to store the stock ticker symbols.
' NOTE: The array can store 100 elements indexed from 0 to 99.
Dim StockTickerSymbols$(100)
' Define a variable to index into both string arrays.
StockIndex = 0
' Define string variables relating to the Excel spreadsheet file
ExcelSpreadsheet$ = "C:\DailyStockQuotes.xls"
CellsForSymbols$ = "A1:A100" ' Maximum of 100 stock ticker symbols
' Read the list of stock ticker symbols from the spreadsheet.
' The data will be read into the symbols string array.
ReadExcel(ExcelSpreadsheet$, CellsForSymbols$, StockTickerSymbols$())
' Start Internet Explorer and load the web page.
StartBrowser("IE", "www.marketwatch.com/tools/quotes/profile.asp")
' Define a string variable to hold the current stock ticker symbol
TickerSymbol$ = StockTickerSymbols$(StockIndex)
' Wait for the www.marketwatch.com stock quotes web page to load.
UsePage("Profile")
' The next two statements enter the stock ticker symbol into the text
' field and clicks the arrow button respectively requesting the price
' of the stock.
WriteHTML("INPUT TEXT[NAME= 'symb']", TickerSymbol$)
ClickHTMLElement("INPUT IMAGE[SRC= 'http://www.marketwat']")
' Wait for the page to be updated with the stock price quote.
UsePage("Profile")
' Capture the stock price quote from the table on the web page.
' A single cell from the table is captured and written into the
' first element of the temporary string array.
ret = CaptureTableHTML("TABLE[CONTENT='"+TickerSymbol$+"']", "R3C2:R3C2", CapturedData$())
' Close Internet Explorer
CloseWindow("IEXPLORE.EXE|IEFrame|Profile - Microsoft Internet Explorer",1)
' THESE LINES ADDED MANUALLY TO VERIFY OPERATION OF SCRIPT
' Display the stock price quote captured from the web page.
MsgBox(CapturedData$(0), 0, "Price for Stock Ticker Symbol: "+TickerSymbol$)
' Display the second and third symbols read from the spreadsheet.
MsgBox(StockTickerSymbols$(1), 0, "Second spreadsheet Stock Ticker Symbol")
MsgBox(StockTickerSymbols$(2), 0, "Third spreadsheet Stock Ticker Symbol")
Save the script using the name "C:\quote_read_excel.src". Open Excel and type the stock ticker symbols AES, ANF, and ALB into cells A1 through A3 inclusive. Save the spreadsheet using the name "C:\DailyStockQuotes.xls".
Return to the WinTask editor. Compile and execute the script. The first message box displayed at the end of the script will report the same results for AES as the two previous scripts in this topic. The succeeding message boxes will report the results for ANF and ALB respectively.
Update the script to add iteration
Now that we have a list of stock ticker symbols in memory, its time to add iteration to the script. Iteration will allow the script to read the stock prices quotes for more symbols than just the single symbol of the previous scripts. All changes from the previous script are shown in RED.
' Allocate a string array to hold the data captured from the web page.
' NOTE #1: The array can store 5 elements indexed from 0 to 4.
' NOTE #2: The capacity of the array can be as small as 1.
Dim CapturedData$(5)
' Allocate two arrays of 100 strings to store the stock ticker
' symbols and the stock price quotes respectively.
' NOTE: Both arrays can store 100 elements indexed from 0 to 99.
Dim StockTickerSymbols$(100)
Dim StockPriceQuotes$(100)
' Define the maximum index for the string arrays.
MAX_ARRAY_INDEX = 100
' Define a variable to index into both string arrays.
StockIndex = 0
' Define string variables relating to the Excel spreadsheet file
ExcelSpreadsheet$ = "C:\DailyStockQuotes.xls"
CellsForSymbols$ = "A1:A100" ' Maximum of 100 stock ticker symbols
' Read the list of stock ticker symbols from the spreadsheet.
' The data will be read into the symbols string array.
ReadExcel(ExcelSpreadsheet$, CellsForSymbols$, StockTickerSymbols$())
' Start Internet Explorer and load the web page.
StartBrowser("IE", "www.marketwatch.com/tools/quotes/profile.asp")
' Define a string variable to hold the current stock ticker symbol
TickerSymbol$ = StockTickerSymbols$(StockIndex)
' Capture stock price quotes as long we have a stock ticker symbol
While (TickerSymbol$ <> "")
' Wait for the www.marketwatch.com stock quotes web page to load.
UsePage("Profile")
' The next two statements enter the stock ticker symbol
' into the text field and clicks the arrow button respectively
' requesting the price of the stock.
WriteHTML("INPUT TEXT[NAME= 'symb']", TickerSymbol$)
ClickHTMLElement("INPUT IMAGE[SRC= 'http://www.marketwat']")
' Wait for the page to be updated with the stock price quote.
UsePage("Profile")
' Capture the stock price quote from the table on the web page.
' A single cell from the table is captured and written into the
' first element of the temporary string array.
ret = CaptureTableHTML("TABLE[CONTENT='"+TickerSymbol$+"']", "R3C2:R3C2", CapturedData$())
' Copy the captured data into the stock price quotes array
StockPriceQuotes$(StockIndex) = CapturedData$(0)
' Get the next stock ticker symbol from the array
StockIndex = StockIndex + 1
If (StockIndex < MAX_ARRAY_INDEX) Then
TickerSymbol$ = StockTickerSymbols$(StockIndex)
Else
TickerSymbol$ = "" ' Reached end of array
EndIf
Wend
' Close Internet Explorer
CloseWindow("IEXPLORE.EXE|IEFrame|Profile - Microsoft Internet Explorer",1)
' THESE LINES ADDED MANUALLY TO VERIFY OPERATION OF SCRIPT
' Display the stock price quotes captured from the web page.
StockIndex = 0
While (StockIndex < 3)
MsgBox(StockPriceQuotes$(StockIndex), 0, "Price for Stock Ticker Symbol: "+StockTickerSymbols$(StockIndex))
StockIndex = StockIndex +1
Wend
Save the script using the name "C:\quote_iteration.src". Compile and execute the script. This script uses the Excel spreadsheet created for the previous script, "C:\DailyStockQuotes.xls".
A total of three message boxes will be displayed. The first will report the results for AES, the second for ANF and the third for ALB. The message boxes should be identical to the previous script in this topic.
Update the script to store the price quotes in the spreadsheet
The final change to the script is to store the accumulated stock price quotes into the Excel spreadsheet. All changes from the previous script are shown in RED.
' Allocate a string array to hold the data captured from the web page.
' NOTE #1: The array can store 5 elements indexed from 0 to 4.
' NOTE #2: The capacity of the array can be as small as 1.
Dim CapturedData$(5)
' Allocate two arrays of 100 strings to store the stock ticker
' symbols and the stock price quotes respectively.
' NOTE: Both arrays can store 100 elements indexed from 0 to 99.
Dim StockTickerSymbols$(100)
Dim StockPriceQuotes$(100)
' Define the maximum index for the string arrays.
MAX_ARRAY_INDEX = 100
' Define a variable to index into both string arrays.
StockIndex = 0
' Define string variables relating to the Excel spreadsheet file
ExcelSpreadsheet$ = "C:\DailyStockQuotes.xls"
CellsForSymbols$ = "A1:A100" ' Maximum of 100 stock ticker symbols
CellsForQuotes$ = "B1:B100" ' Maximum of 100 stock price quotes
' Read the list of stock ticker symbols from the spreadsheet.
' The data will be read into the symbols string array.
ReadExcel(ExcelSpreadsheet$, CellsForSymbols$, StockTickerSymbols$())
' Start Internet Explorer and load the web page.
StartBrowser("IE", "www.marketwatch.com/tools/quotes/profile.asp")
' Define a string variable to hold the current stock ticker symbol
TickerSymbol$ = StockTickerSymbols$(StockIndex)
' Capture stock price quotes as long we have a stock ticker symbol
While (TickerSymbol$ <> "")
' Wait for the www.marketwatch.com stock quotes web page to load.
UsePage("Profile")
' The next two statements enter the stock ticker symbol
' into the text field and clicks the arrow button respectively
' requesting the price of the stock.
WriteHTML("INPUT TEXT[NAME= 'symb']", TickerSymbol$)
ClickHTMLElement("INPUT IMAGE[SRC= 'http://www.marketwat']")
' Wait for the page to be updated with the stock price quote.
UsePage("Profile")
' Capture the stock price quote from the table on the web page.
' A single cell from the table is captured and written into the
' first element of the temporary string array.
ret = CaptureTableHTML("TABLE[CONTENT='"+TickerSymbol$+"']", "R3C2:R3C2", CapturedData$())
' Copy the captured data into the stock price quotes array
StockPriceQuotes$(StockIndex) = CapturedData$(0)
' Get the next stock ticker symbol from the array
StockIndex = StockIndex + 1
If (StockIndex < MAX_ARRAY_INDEX) Then
TickerSymbol$ = StockTickerSymbols$(StockIndex)
Else
TickerSymbol$ = "" ' Reached end of array
EndIf
Wend
' Close Internet Explorer
CloseWindow("IEXPLORE.EXE|IEFrame|Profile - Microsoft Internet Explorer",1)
' Write the captured stock prices quotes into the spreadsheet
WriteExcel(ExcelSpreadsheet$, CellsForQuotes$, StockPriceQuotes$())
Save the script using the name "C:\quote_data_driven.src". Compile and execute the script. This script uses the Excel spreadsheet created for the previous scripts, "C:\DailyStockQuotes.xls".
The captured stock price quotes are now saved in the Excel spreadsheet. When the spreadsheet is inspected, the price quote of each stock is listed in column B next to its respective stock ticker symbol.