Wazup 1.0
About:
Wazup is a module that parses web pages, extracts useful information from the web garbage, and displays it using any Label module (Label/xLabel/xLabelLight). You can also write the extracted info to a file and display it with the Oborzevatel module to get "true HTML".
System Requirements:
There are no specific software or hardware limitations. I hope...
Usage:
Each web-page query must be declared in the following manner:
*Wazup [name]

All query settings have the following form:
[name][option] [value]
Example:
*Wazup MySiteNews
MySiteNewsURL http://www.mysite.ru/index.html

Sample configuration using xLabelLight 2.8.3:
;===============================================
; Web-page monitoring (Wazup v1.0)
;-----------------------------------------------

; This configuration isn't of much practical use to people
; who don't know Russian, but it was the only page available
; locally on my machine :)

*Wazup LSAtRu

; NewsItem describes a pattern for extracting information about
; one news item. This little trick helps us avoid
; silly copying

NewsItem "news_top{%quote}>{%}</td>{*}right{%quote}>{%}</div></td>{*}news_post{%quote}>{%}<div align={%quote}right"

LSAtRuURL http://www.litestep.bip.ru/
LSAtRuUpdateInterval 10
LSAtRuInputString "$NewsItem${*}$NewsItem${*}$NewsItem${*}$NewsItem${*}$NewsItem$"

LSAtRuOutputString LiteStep@Russia News\n\n{%1}\n{%2}\n{%3}\n\n{%4}\n{%5}\n{%6}\n\n{%7}\n{%8}\n{%9}\n\n{%10}\n{%11}\n{%12}\n\n{%13}\n{%14}\n{%15}\n\n

LSAtRuEnabled true

LSAtRuOnChecked !Execute [!WazupSetUpdateInterval LSAtRu 900][!LabelShow NewsLabel]
LSAtRuOnFailure !Execute [!WazupSetUpdateInterval LSAtRu 10]

LSAtRuOnUpdated !alert "Something new!"

LSAtRuLocalFile "$MiscDir$lsnews.html"
LSAtRuDisplayOn NewsLabel

;======================
*Label NewsLabel
;----------------------
NewsLabelX 100
NewsLabelY 100
NewsLabelWidth 300
NewsLabelHeight 600
NewsLabelText "Here would be the dragons"
NewsLabelImage lsnews.png
NewsLabelImageMode stretch
NewsLabelImageTopEdge 3
NewsLabelImageBottomEdge 3
NewsLabelLeftBorder 5
NewsLabelRightBorder 5
NewsLabelTopBorder 10
NewsLabelBottomBorder 5
NewsLabelAutoLineBreak true
NewsLabelAlign left
NewsLabelVertAlign top
NewsLabelStartHidden
NewsLabelScroll vertical-up
Configuration:
Every web-query setting must begin with the query name specified in the *Wazup line.

Parameters:

  • (query-name)URL [web-page]
    Web page URL. Local files cannot be used as the URL!
    Default: http://www.shellfront.org/
  • (query-name)LocalFile [file]
    If set, this file is used to track changes to the web page. If you omit this option, a temporary file is used and the OnUpdated action is disabled.
    Default: empty string, tracking disabled
  • (query-name)Enabled [false/true]
    If enabled, the news is updated automatically. Otherwise you need to run the !WazupCheck bang manually.
    Default: true
  • (query-name)UpdateInterval [number]
    Time interval between checks (in seconds!).
    Default: 600 (10 minutes)
  • (query-name)InputString [pattern]
    This string defines the page pattern used to locate the target information on the page.
    The module compares the pattern with the downloaded page and extracts the useful substrings from it.
    Extracted substrings are numbered starting from 1.
    For example, if the target page looks like this:

    <html>
      <body>
        21.04.2004<br>
        Seg@<br>
        Wazzzzzup!
      </body>
    </html>
    and the pattern is the following:

    <body>{%}<br>{%}<br>{%}</body>
    wazup.dll would extract 3 substrings from the page:
      1 - 21.04.2004
      2 - Seg@
      3 - Wazzzzzup!
    More detailed information about writing patterns is available in the "Writing a pattern" section.
    Default: {%}
  • (query-name)OutputString [pattern]
    This string defines the form of the output, i.e. the transformed text. It is plain text in which each extracted string is represented by {%N} (N is the number of the extracted string).
    For example, if parsing the web page gives us the following set of strings:
      1 - 21.04.2004
      2 - Seg@
      3 - Wazzzzzup!
    then the following pattern:

    Post date: {%1}\nNews Maker: {%2}\n{%3}
    lets us show properly formatted text in the Label:

    Post date: 21.04.2004
    News Maker: Seg@
    Wazzzzzup!

    Default: {%1}
  • (query-name)OnChecked [action]
    Action performed after the web page has been successfully downloaded and parsed.
    Default: !none
  • (query-name)OnFailure [action]
    Action performed if downloading or parsing the page failed.
    Default: !none
  • (query-name)OnUpdated [action]
    Action performed if the page has changed since the last check. Requires LocalFile to be set.
    Default: !none
  • (query-name)OutputLabel [label name]
    If set and not empty, this label is used to display the output string.
    Default: empty
  • (query-name)OutputFile [file path]
    If set and not empty, the output string is written to this file after each successful web-page check.
    You may use this option, for instance, if you want to display "true HTML" with the Oborzevatel module. Wazup.dll does not remove HTML tags before writing the file, although it does for Label output.
    Default: empty
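
A minimal sketch of one complete query, assembled only from the parameters listed above and the example page used for InputString. The query name MySiteNews and its URL are taken from the example near the top of this document; the label name MyNewsLabel, the file name mysitenews.html and the 900-second interval are assumptions for illustration, and $MiscDir$ is assumed to be defined as in the sample configuration. Note that the sample configuration uses DisplayOn where this reference lists OutputLabel; the sketch follows the reference.

    *Wazup MySiteNews
    MySiteNewsURL http://www.mysite.ru/index.html
    ; check every 15 minutes instead of the default 10
    MySiteNewsUpdateInterval 900
    ; LocalFile must be set, otherwise OnUpdated is disabled
    MySiteNewsLocalFile "$MiscDir$mysitenews.html"
    MySiteNewsInputString "<body>{%}<br>{%}<br>{%}</body>"
    MySiteNewsOutputString Post date: {%1}\nNews Maker: {%2}\n{%3}
    ; MyNewsLabel is a hypothetical label defined elsewhere with *Label
    MySiteNewsOutputLabel MyNewsLabel
    MySiteNewsOnUpdated !alert "Something new!"

Enabled is left out because it defaults to true.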

!Bangs:
The first parameter of each !bang is the query name.
For example, to read the page described by the MySiteNews query, run:
    !WazupCheck MySiteNews

Full list of available !bangs:

  • !WazupCheck (query-name)
    Check the web page right now.
  • !WazupEnable (query-name)
    Enable automatic updating of the page.
  • !WazupDisable (query-name)
    Disable automatic updating.
  • !WazupToggle (query-name)
    Toggle the automatic-updating state.
  • !WazupSetURL (query-name) [URL]
    Change source web-page URL.
  • !WazupSetInputString (query-name) [pattern]
    Change web-page pattern.
  • !WazupSetOutputString (query-name) [pattern]
    Change output format.
  • !WazupSetUpdateInterval (query-name) [time in seconds]
    Change the time interval between checks.
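
The !bangs can also be called from a query's own actions, as the sample configuration does. A small sketch, assuming a query named MySiteNews and retry intervals picked only for illustration:

    ; retry every 60 seconds while the page cannot be downloaded or parsed
    MySiteNewsOnFailure !WazupSetUpdateInterval MySiteNews 60
    ; return to 10-minute checks once a check succeeds
    MySiteNewsOnChecked !WazupSetUpdateInterval MySiteNews 600

To run several !bangs from one action, the sample configuration wraps them in !Execute [...][...].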

Writing a pattern:
A pattern is a regular string in which some pieces of text are replaced with escape sequences. Here is the list of escape sequences you may use in a Wazup input-string pattern:
  • {*}
    Any text. Use this to skip something long that doesn't matter to you.
  • {%}
    Extracted substring. This sequence marks useful information that must be remembered for later use in the output.
  • {%,N}
    Extracted substring consisting of N characters.
  • {%quote}
    A double-quote character.

Extracted strings are numbered from 1 in the order of extraction.

A simple example of pattern usage:

Let's imagine that we have a page with the following content:
    <html>
      <body>
        We're the champions, my friend!
      </body>
    </html>
and the following pattern:
    MySiteNewsInputString "<body>{%}</body>"
When the module parses the page, it skips everything up to the first occurrence of <body>, then reads a substring (because {%} is specified here) until it meets </body>. The resulting string is stored as substring #1.
When the output pattern is something like this:
    MySiteNewsOutputString Msg: {%1}
the output string is:
    Msg: We're the champions, my friend!
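
As a further sketch, a pattern can combine {%quote}, {%,N} and {*}. The page fragment and the MySiteNews query name below are assumptions for illustration, and the result is what the descriptions above imply rather than tested output. Suppose the page contains:

    <div class="date">21.04.2004 18:30</div>

and the query is configured like this:

    ; the pattern contains a space, so it is enclosed in double quotes;
    ; the double quotes of the page itself are written as {%quote}
    MySiteNewsInputString "<div class={%quote}date{%quote}>{%,10}{*}</div>"
    MySiteNewsOutputString Posted: {%1}

Here {%,10} should take the first 10 characters after the opening tag (21.04.2004) as substring #1, and {*} should skip the rest up to </div>, so the output would read "Posted: 21.04.2004".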

Notes:
Just some important notes:
  • If InputString contains spaces, it should be enclosed in double quotes.
  • If InputString contains double quotes, you need to replace them with the escape sequence {%quote}.
  • If you want to extract a news body, do it as in the sample config file: use an $eVar$ to define a single news item. It will save you time and make the RC file more readable :)
  • Don't forget that the maximum length of an RC-file line is 4096 characters (with environment variables expanded)!!!
  • That's why you may sometimes use {*} instead of {%quote}: it is shorter.
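
A stripped-down sketch of the $eVar$ trick from the sample configuration, assuming a hypothetical page whose news items are plain <li> elements. The names Item and MySiteNews are illustrative only.

    ; one news item, defined once
    Item "<li>{%}</li>"
    ; three items in a row, skipping whatever lies between them
    MySiteNewsInputString "$Item${*}$Item${*}$Item$"
    MySiteNewsOutputString {%1}\n{%2}\n{%3}

Keep in mind that the 4096-character limit applies to the line after $Item$ has been expanded.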

Changelog:
Version 1.0, 21.04.2004
  • Initial release... and final, I really hope :)

Author:
Handle : Sergey Gagarin a.k.a. Seg@
E-Mail : inform-sega@freemail.ru
Web : http://www.litestep.bip.ru/
ICQ : 162261148
IRC : #litestep @ freenode.net