Automation Action: HTTP Get
Read any web page or web API and assign the returned content to a variable.
Reads an HTTP resource using HTTP GET, or a file from a local path, and assigns the returned content to a variable.
Use this Action to read any HTTP resource (such as a web page) or a local HTML file. Specify the URL Or File Path To Get (including any query string).
If the web resource requires authentication, specify the Authentication method and optionally a User Name/Password or an OAuth Auth Token retrieved from a previous OAuth SignIn action. Authentication does not apply when reading from a local file path. You can also use Amazon AWS Signed Request authentication for signing AWS requests.
Optionally specify any Query String Parameters and Custom Headers to add to the request. Query string parameters can either be specified in the URL itself or in the Query String Parameters grid. If you specify query string parameters in the grid any %variable% replacements will be automatically URL encoded.
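The effect of URL encoding query string parameters can be illustrated outside ThinkAutomation with a minimal Python sketch. The `build_url` helper and the `api.example.com` address are hypothetical and only show what encoding does to a value containing spaces and an ampersand; this is not ThinkAutomation's own implementation.

```python
from urllib.parse import urlencode, quote


def build_url(base_url, params):
    """Append URL-encoded query string parameters to base_url (hypothetical helper)."""
    if not params:
        return base_url
    # Use '&' if the URL already carries a query string, otherwise start one with '?'.
    separator = "&" if "?" in base_url else "?"
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return base_url + separator + urlencode(params, quote_via=quote)


url = build_url("https://api.example.com/search", {"q": "red & blue", "page": "2"})
print(url)  # https://api.example.com/search?q=red%20%26%20blue&page=2
```

Without encoding, the `&` inside the value would be read as a parameter separator, which is why values supplied through the grid are encoded.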
You can also optionally specify to Add To Local Cache. If this option is enabled, ThinkAutomation will maintain a local cached copy of the content. You can specify the number of Minutes the content should remain cached. If the same URL is requested again within this period it will be read from the cache.
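The caching behavior described above can be sketched as a time-limited lookup keyed by URL. The `TimedCache` class, its `clock` parameter, and the `fetch` callback are illustrative names, and the in-memory dictionary is an assumption; how ThinkAutomation actually stores its cache is not stated here.

```python
import time


class TimedCache:
    """Keeps fetched content per URL for a fixed number of minutes (illustrative only)."""

    def __init__(self, minutes, clock=time.monotonic):
        self.ttl_seconds = minutes * 60
        self.clock = clock          # injectable so the expiry logic can be tested
        self.entries = {}           # url -> (stored_at, content)

    def get(self, url, fetch):
        """Return cached content if still fresh, otherwise call fetch(url) and store it."""
        now = self.clock()
        entry = self.entries.get(url)
        if entry is not None and now - entry[0] < self.ttl_seconds:
            return entry[1]         # same URL requested within the cache period
        content = fetch(url)
        self.entries[url] = (now, content)
        return content
```

A second request for the same URL inside the cache period never reaches `fetch`, which is the saving the option provides.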
You can specify the Connection Timeout (in seconds). This is the number of seconds to wait for the initial connection. The Response Timeout is the number of seconds to wait for a response after the connection has been made.
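The difference between the two timeouts can be shown with a rough Python sketch using only the standard library. The `http_get` function and its parameter names are hypothetical; the sketch assumes a plain GET with no authentication and is not how ThinkAutomation performs the request internally.

```python
import http.client
from urllib.parse import urlsplit


def http_get(url, connect_timeout=10.0, response_timeout=30.0):
    """GET a URL with separate connection and response timeouts (in seconds)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn_class = http.client.HTTPSConnection
    else:
        conn_class = http.client.HTTPConnection
    conn = conn_class(parts.hostname, parts.port, timeout=connect_timeout)
    try:
        conn.connect()                           # limited by connect_timeout
        conn.sock.settimeout(response_timeout)   # applies to sending and receiving from here on
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.read().decode("utf-8", errors="replace")
    finally:
        conn.close()
```

A short connection timeout fails fast when a host is unreachable, while a longer response timeout gives a slow but reachable server time to answer.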
The Convert Returned Content To option enables you to convert the HTTP response content. Options are:
- Nothing : The response is returned as is.
- Convert HTML To Plain Text : Removes all HTML tags and returns only readable text.
- Convert HTML To Markdown : Converts the HTML to Markdown text. Images will be removed. The tags <nav> and <footer> will also be removed before conversion. If you need finer control over HTML to Markdown conversion, leave the HTML as is and use the Text Operation action.
- Convert HTML To XML : Converts the HTML to well-formed XML allowing easier parsing.
- Convert HTML To XML (Drop Formatting) : Converts the HTML to XML and drops all formatting tags, styles, images, scripts etc. This allows easier parsing of specific text elements. See: HTML To XML.
- Convert HTML To Json (Drop Formatting) : Converts HTML To Json and drops all formatting tags, styles, images, scripts etc. This allows easier parsing of specific text elements. See: HTML To Json.
- Convert XML To Json : Converts XML to Json. Useful if the HTTP response is XML format and you need to work with Json.
- Convert CSS To Inline Styles : Moves all CSS style sheet rules to inline style attributes. This enables the HTML to be sent via email, as most email clients only support inline styles.
- Convert Relative Links To Absolute Links : Converts all relative links to absolute links. For example, if requesting a URL from http://www.mysite.com: <img src="image.png"> becomes <img src="http://www.mysite.com/image.png">
- Convert CSS To Inline And Relative Links To Absolute : Performs both above operations.
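To make the relative-to-absolute link conversion concrete, here is a small Python sketch that reproduces the example from the list above. The `absolutize_links` function is hypothetical, and the regular expression is a deliberate simplification that only handles quoted `src` and `href` attributes; the product's own converter may handle more cases.

```python
import re
from urllib.parse import urljoin

# Matches src="..." or href='...' and captures the prefix, the quote character and the link.
ATTR_PATTERN = re.compile(r'''(\b(?:src|href)\s*=\s*)(["'])(.*?)\2''', re.IGNORECASE)


def absolutize_links(html, base_url):
    """Rewrite relative src/href values so they are absolute against base_url."""
    def replace(match):
        prefix, quote_char, link = match.groups()
        # urljoin leaves links that are already absolute unchanged.
        return prefix + quote_char + urljoin(base_url, link) + quote_char
    return ATTR_PATTERN.sub(replace, html)


print(absolutize_links('<img src="image.png">', "http://www.mysite.com"))
# <img src="http://www.mysite.com/image.png">
```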
The returned content can then be assigned to a variable. Select from the Assign Content To list. You can then make use of the returned content in subsequent Actions.
You can optionally assign any HTML <title> tag to a variable. Select from the Assign Title To list. If the returned content is not HTML or has no <title> tag the variable will be set to blank.
You can optionally assign any <meta name="description" content="xxx"> description tag to a variable. Select from the Assign Description To list. If the returned content is not HTML or has no description tag then the variable will be set to blank.
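The title and description assignment can be approximated with the following Python sketch, which returns blank strings when the content is not HTML or lacks the tags. The class and function names are hypothetical, and the sketch assumes the description uses the standard HTML form of a meta tag with name="description" and a content attribute.

```python
from html.parser import HTMLParser


class TitleDescriptionParser(HTMLParser):
    """Collects the <title> text and the meta description, if present."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attributes.get("name") or "").lower() == "description":
            self.description = attributes.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title_and_description(content):
    """Return (title, description); both are blank if not found."""
    parser = TitleDescriptionParser()
    parser.feed(content)
    parser.close()
    return parser.title.strip(), parser.description.strip()
```

For example, calling `extract_title_and_description` on a JSON API response yields two empty strings, matching the "set to blank" behavior described above.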
Response Status
The HTTP response status code & response headers can also optionally be assigned to variables.
The status code will be the HTTP response status (200, 404 etc). A status code below 100 indicates a connection error (for example: 2 = 'DNS lookup failed', 3 = 'DNS lookup timeout', 6 = 'Connect timeout'). The error details will be added to the log.
If the Throw Error On HTTP Errors option is enabled then the Automation will log an error if the HTTP status is an error status (404, 500 etc). If this option is not enabled then an error will not be raised (the status will still be logged). This is useful if the purpose of your Automation is to check for HTTP errors. Note: Connection errors will always throw an error.
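The status handling rules above can be summarized in a short Python sketch. The `check_status` function and `HttpGetError` exception are hypothetical names, and treating every status of 400 or higher as an "error status" is an assumption based on the 404 and 500 examples given.

```python
# Connection error codes listed in this section; other codes below 100 exist but are not listed here.
CONNECTION_ERRORS = {
    2: "DNS lookup failed",
    3: "DNS lookup timeout",
    6: "Connect timeout",
}


class HttpGetError(Exception):
    """Raised for connection errors and, optionally, HTTP error statuses."""


def check_status(status_code, throw_on_http_errors=True):
    """Apply the rules above and return the status code if no error is raised."""
    if status_code < 100:
        # Connection errors always raise, regardless of the option.
        detail = CONNECTION_ERRORS.get(status_code, "Connection error")
        raise HttpGetError(f"{detail} (code {status_code})")
    if status_code >= 400 and throw_on_http_errors:
        # Assumption: 4xx and 5xx statuses count as HTTP errors.
        raise HttpGetError(f"HTTP error status {status_code}")
    return status_code
```

With `throw_on_http_errors=False`, a 404 is returned as an ordinary value, which suits an Automation whose purpose is to monitor a site for HTTP errors.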