Wazup 1.1
About:
Wazup is a module that parses web pages, extracts useful information from the web garbage and displays it using any Label module (Label/xLabel/xLabelLight). You can also write the extracted info to a file and use it with the Oborzevatel module to get "true HTML".
System Requirements:
There are no specific software or hardware requirements. I hope...
Usage:
Each web-page query must be declared in the following manner:
*Wazup [name]
All query settings have the following form:
[name][option] [value]
Example:
*Wazup MySiteNews
MySiteNewsURL http://www.mysite.ru/index.html
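A slightly fuller sketch of a query, using only options described in the Configuration section of this document. The query name MySiteNews, the URL and the label name NewsLabel are placeholders, and the pattern assumes the page keeps its news inside the <body> tag.

```
*Wazup MySiteNews
MySiteNewsURL http://www.mysite.ru/index.html
; extract everything between <body> and </body> as substring 1
MySiteNewsInputString "<body>{%}</body>"
; quotes are needed because the output pattern contains a space
MySiteNewsOutputString "News: {%1}"
; show the result on the label named NewsLabel
MySiteNewsDisplayOn NewsLabel
```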
Sample configuration using xLabelLight 2.8.3:
;===============================================
; Web-page monitoring (Wazup v1.0)
;-----------------------------------------------
; This configuration isn't of much practical use to people
; who don't read Russian, but it was the only page available
; locally on my machine :)
*Wazup LSAtRu
; NewsItem describes a pattern for extracting information about
; one news item. This little trick helps us avoid
; silly copy-pasting
NewsItem "news_top{%quote}>{%}</td>{*}right{%quote}>{%}</div></td>{*}news_post{%quote}>{%}<div align={%quote}right"
LSAtRuURL http://www.litestep.bip.ru/
LSAtRuUpdateInterval 10
LSAtRuInputString "$NewsItem${*}$NewsItem${*}$NewsItem${*}$NewsItem${*}$NewsItem$"
LSAtRuOutputString "LiteStep@Russia News\n\n{%1}\n{%2}\n{%3}\n\n{%4}\n{%5}\n{%6}\n\n{%7}\n{%8}\n{%9}\n\n{%10}\n{%11}\n{%12}\n\n{%13}\n{%14}\n{%15}"
LSAtRuEnabled true
LSAtRuOnChecked !Execute [!WazupSetUpdateInterval LSAtRu 900][!LabelShow NewsLabel]
LSAtRuOnFailure !Execute [!WazupSetUpdateInterval LSAtRu 10]
LSAtRuOnUpdated !alert "Something new!"
LSAtRuLocalFile "$MiscDir$lsnews.html"
LSAtRuDisplayOn NewsLabel
;======================
*Label NewsLabel
;----------------------
NewsLabelX 100
NewsLabelY 100
NewsLabelWidth 300
NewsLabelHeight 600
NewsLabelText "Here be dragons"
NewsLabelImage lsnews.png
NewsLabelImageMode stretch
NewsLabelImageTopEdge 3
NewsLabelImageBottomEdge 3
NewsLabelLeftBorder 5
NewsLabelRightBorder 5
NewsLabelTopBorder 10
NewsLabelBottomBorder 5
NewsLabelAutoLineBreak true
NewsLabelAlign left
NewsLabelVertAlign top
NewsLabelStartHidden
NewsLabelScroll vertical-up
Configuration:
Every query setting must begin with the query name specified in the
*Wazup line.
Parameters:
(query-name)URL [web-page]
URL of the web page. Local files cannot be used as the URL!
Default: http://www.shellfront.org/
(query-name)LocalFile [file]
If set, this file is used to track changes of the web page. If you omit this option, a temporary file is used and the OnUpdated action is disabled.
Default: empty string, tracking disabled
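For instance, change tracking could be switched on like this. The file name mysitenews.html and the alert text are placeholders; $MiscDir$ is the variable used in the sample configuration and is assumed to be defined in your setup.

```
; keep a local copy of the page so that changes can be detected
MySiteNewsLocalFile "$MiscDir$mysitenews.html"
; performed only when the page has changed since the last check
MySiteNewsOnUpdated !alert "MySite has news"
```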
(query-name)Enabled [false/true]
If enabled, the news is updated automatically. Otherwise you need to call the !WazupCheck bang manually.
Default: true
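A sketch of a query that is only checked on request (MySiteNews is a placeholder name). How the bang gets triggered, for example from a hotkey or a popup entry, depends on your other modules and is not covered here.

```
; no automatic updates for this query
MySiteNewsEnabled false
; fetch the page on demand by running this bang:
; !WazupCheck MySiteNews
```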
(query-name)UpdateInterval [number]
Time interval between checks (in seconds!).
Default: 600 (10 minutes)
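Because the value is given in seconds, longer intervals need a little arithmetic: 30 minutes is 30 x 60 = 1800 seconds. MySiteNews is a placeholder query name.

```
; check every 30 minutes (30 * 60 seconds)
MySiteNewsUpdateInterval 1800
```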
(query-name)InputString [pattern]
This string defines a page pattern that is used to locate the target information on the page.
The module compares the pattern with the downloaded page and extracts the useful substrings from it.
Extracted strings are numbered starting from 1.
For example, if the target page looks like this:
<html>
<body>
21.04.2004<br>
Seg@<br>
Wazzzzzup!
</body>
</html>
and the pattern is the following:
<body>{%}<br>{%}<br>{%}</body>
wazup.dll will extract 3 substrings from the page:
1 - 21.04.2004
2 - Seg@
3 - Wazzzzzup!
More detailed information about writing patterns is available in the "Writing a pattern" section.
Default: {%}
(query-name)OutputString [pattern]
This string defines the form of the output, i.e. the transformed text. It is plain text in which each extracted string is represented by {%N} (N is the number of the extracted string).
For example, if parsing the web page gives us the following set of strings:
1 - 21.04.2004
2 - Seg@
3 - Wazzzzzup!
then the following pattern:
Post date: {%1}\nNews Maker: {%2}\n{%3}
lets us show nicely formatted text in the Label:
Post date: 21.04.2004
News Maker: Seg@
Wazzzzzup!
Default: {%1}
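Written as settings of a query, the two example patterns could look like this. MySiteNews is a placeholder name; both values are quoted because patterns with spaces have to be quoted (see the Notes and the 1.1 changelog entry).

```
; three extracted substrings: date, author, message
MySiteNewsInputString "<body>{%}<br>{%}<br>{%}</body>"
; \n starts a new line in the label text
MySiteNewsOutputString "Post date: {%1}\nNews Maker: {%2}\n{%3}"
```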
(query-name)NewLine [string]
When text is displayed on a label, <br> and <p> HTML tags are replaced with this string. Use this setting if you want a single-line label with scrolling.
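A sketch for a single-line label: each <br> or <p> becomes a visible separator instead of a line break. MySiteNews is a placeholder name, and it is an assumption that a quoted value keeps its surrounding spaces. A semicolon is avoided as the separator because ';' also splits label text into items and starts comments.

```
; join the lines with " - " rather than breaking them
MySiteNewsNewLine " - "
```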
(query-name)OnChecked [action]
Action performed after the web page has been successfully downloaded and parsed.
Default: !none
(query-name)OnFailure [action]
Action performed if downloading or parsing the page fails.
Default: !none
(query-name)OnUpdated [action]
Action performed if the page has changed since the last check. LocalFile must be set for this to work.
Default: !none
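The sample configuration combines these actions into a simple retry scheme. The same idea for a placeholder query MySiteNews might look like this; the interval values are arbitrary.

```
; start with a short interval so the first successful check comes quickly
MySiteNewsUpdateInterval 10
; after a successful check, slow down to once every 15 minutes
MySiteNewsOnChecked !WazupSetUpdateInterval MySiteNews 900
; after a failure, go back to retrying every 10 seconds
MySiteNewsOnFailure !WazupSetUpdateInterval MySiteNews 10
```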
(query-name)DisplayOn [label name]
If set and not empty, this label is used to display the output string.
Default: empty
(query-name)OutputFile [file path]
If set and not empty, the output string is written to this file after each successful web-page check.
You may use this option, for instance, if you want to display "true HTML" with the Oborzevatel module. Wazup.dll doesn't remove HTML tags before writing the file, although it does for Label output.
Default: empty
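A minimal sketch of writing the output to a file. The query name MySiteNews and the file name are placeholders, and $MiscDir$ is assumed to be defined as in the sample configuration. Pointing Oborzevatel at that file is configured in that module and is outside the scope of this document.

```
; write the formatted output to a file after every successful check
MySiteNewsOutputFile "$MiscDir$mysitenews_out.html"
```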
!Bangs:
The first parameter of each !bang is the query name.
For example, to read the page described by the MySiteNews query, use the following:
!WazupCheck MySiteNews
Full list of available !bangs:
!WazupCheck (query-name)
Check the web page immediately.
!WazupEnable (query-name)
Enable autoupdating of the page.
!WazupDisable (query-name)
Disable autoupdating.
!WazupToggle (query-name)
Toggle autoupdating state.
!WazupSetURL (query-name) [URL]
Change source web-page URL.
!WazupSetInputString (query-name) [pattern]
Change web-page pattern.
!WazupSetOutputString (query-name) [pattern]
Change output format.
!WazupSetUpdateInterval (query-name) [time in seconds]
Change the time interval between checks.
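Bangs can be chained with !Execute, as the sample configuration does. The sketch below points the placeholder query MySiteNews at another page and refreshes it at once; the archive URL is made up for illustration.

```
!Execute [!WazupSetURL MySiteNews http://www.mysite.ru/archive.html][!WazupCheck MySiteNews]
```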
Writing a pattern:
A pattern is an ordinary string in which some pieces of text are replaced with escape sequences. Here is the list of escape sequences you may use in a Wazup InputString pattern:
{*}
Any text. Use this to skip something long that doesn't matter to you.
{%}
Extracted substring. This sequence marks useful information that must be remembered for later use in the output.
{%,N}
Extracted substring consisting of N characters.
{%quote}
A double-quote character.
Extracted strings are numbered from 1 in the order of extraction.
Simple example of pattern usage.
Let's imagine that we have a page with the following content:
<html>
<body>
We're the champions, my friend!
</body>
</html>
and the following pattern:
MySiteNewsInputString "<body>{%}</body>"
When the module parses the page, it skips everything up to the first occurrence of <body>, then reads a substring (the user specified {%} here) until it meets </body>. The resulting string is stored as substring #1.
When the output pattern is something like this
MySiteNewsOutputString "Msg: {%1}"
the output string will be:
Msg: We're the champions, my friend!
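A slightly more involved sketch that also uses {%,N} and {%quote}. Assume, purely for illustration, that the page contains this fragment:

<td class="date">21.04.2004</td><td class="text">Wazzzzzup!</td>

```
; {%quote} stands for each double quote in the page source
; {%,10} extracts the 10-character date, {*} skips up to the next cell
MySiteNewsInputString "<td class={%quote}date{%quote}>{%,10}</td>{*}{%quote}text{%quote}>{%}</td>"
MySiteNewsOutputString "{%1}: {%2}"
```

Going by the descriptions of the escape sequences, substring 1 should be 21.04.2004 and substring 2 should be Wazzzzzup!, but this exact pattern has not been checked against the module.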
Notes:
Just some important notes:
If InputString contains spaces, it must be enclosed in double quotes.
If InputString contains double quotes, replace them with the escape sequence {%quote}.
If you want to extract a news body, do it as in the sample config file: use an $eVar$ to define a single news item. It will save you time and make the RC file more readable :)
Don't forget that the maximum length of an RC-file line is 4096 characters (with environment variables expanded)!!!
That's why you may sometimes use {*} instead of {%quote}: it is shorter.
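The last three notes can be sketched together. The variable names NewsDateStrict and NewsDateShort, the query name MySiteNews and the HTML being matched are all made up for illustration. Each {*} saves five characters compared with {%quote}, but it matches any text rather than exactly one quote, so the shorter pattern is also looser.

```
; strict form: every double quote is spelled out
NewsDateStrict "class={%quote}date{%quote}>{%}</td>"
; shorter form: {*} stands in for each quote
NewsDateShort "class={*}date{*}>{%}</td>"
; reuse the item pattern twice; the expanded line must stay under 4096 characters
MySiteNewsInputString "$NewsDateShort${*}$NewsDateShort$"
```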
Changelog:
Version 1.1, 24.04.2004 (hmmm, the last release wasn't final :) )
Fixed broken support for the Label multiple-text-items feature (s1;s2;s3). Now you can use semicolons in OutputString to split news
BUT: it is now necessary to use quotes if OutputString contains spaces, otherwise it will be cut at the first space character!
Also don't forget the quotes if you divide the label text into multiple items, since ';' is a comment character too
Fixed a typo in the documentation, reported by Eddie Hung.
If LocalFile is set and !WazupSetOutputString is called, the label text is changed automatically
You may also use this feature as a trick to load text from a file on startup
Added the NewLine option to make it possible to output some HTML files to single-line labels
Version 1.0, 21.04.2004
Initial release... and final, I really hope :)
Author:
Handle :
Sergey Gagarin a.k.a. Seg@
E-Mail :
inform-sega@freemail.ru
Web :
http://www.litestep.bip.ru/
ICQ : 162261148
IRC : #litestep @ freenode.net