PowerToys/PythonHome/Lib/HTMLParser.pyc

80 lines
14 KiB

2014-07-09 18:15:23 +08:00
<03>
<EFBFBD>W`Sc@sdZddlZddlZejd<00>Zejd<00>Zejd<00>Zejd<00>Zejd<00>Zejd<00>Z ejd <00>Z
ejd
<00>Z ejd <00>Z ejd <00>Z ejd ej<00>Zejd<00>Zejd<00>Zdefd<00><00>YZdejfd<00><00>YZdS(sA parser for HTML and XHTML.i<><69><EFBFBD><EFBFBD>Ns[&<]s
&[a-zA-Z#]s%&([a-zA-Z][-.a-zA-Z0-9]*)[^a-zA-Z0-9]s)&#(?:[0-9]+|[xX][0-9a-fA-F]+)[^0-9a-fA-F]s <[a-zA-Z]t>s--\s*>s$([a-zA-Z][^
/>]*)(?:\s|/(?!>))*s[a-zA-Z][^
/>]*s]((?<=[\'"\s/])[^\s/>][^\s/=>]*)(\s*=+\s*(\'[^\']*\'|"[^"]*"|(?![\'"])[^>\s]*))?(?:\s|/(?!>))*s
<[a-zA-Z][^\t\n\r\f />\x00]* # tag name
(?:[\s/]* # optional whitespace before attribute name
(?:(?<=['"\s/])[^\s/>][^\s/=>]* # attribute name
(?:\s*=+\s* # value indicator
(?:'[^']*' # LITA-enclosed value
|"[^"]*" # LIT-enclosed value
|(?!['"])[^>\s]* # bare value
)
)?(?:\s|/(?!>))*
)*
)?
\s* # trailing whitespace
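The verbose pattern above locates the end of a start tag. As a rough sketch of how it behaves, the block below recompiles the same pattern text with `re.VERBOSE` and reports the span it matches. The names `LOCATE_START_TAG_END` and `start_tag_span` are hypothetical and chosen only for this illustration; they are not taken from the compiled module.

```python
import re

# Same pattern text as in the dump above, recompiled for illustration.
# The constant name is an assumption made for this sketch.
LOCATE_START_TAG_END = re.compile(r"""
  <[a-zA-Z][^\t\n\r\f />\x00]*       # tag name
  (?:[\s/]*                          # optional whitespace before attribute name
    (?:(?<=['"\s/])[^\s/>][^\s/=>]*  # attribute name
      (?:\s*=+\s*                    # value indicator
        (?:'[^']*'                   # LITA-enclosed value
          |"[^"]*"                   # LIT-enclosed value
          |(?!['"])[^>\s]*           # bare value
         )
       )?(?:\s|/(?!>))*
     )*
   )?
  \s*                                # trailing whitespace
""", re.VERBOSE)


def start_tag_span(text, pos=0):
    """Return (start, end) of the start-tag prefix at pos, or None.

    The match stops just before the closing '>' or '/>', so the caller
    still has to inspect the character at `end`.
    """
    match = LOCATE_START_TAG_END.match(text, pos)
    if match is None:
        return None
    return match.start(), match.end()
```

For the input `<a href="x" class=y>rest`, the span ends at the index of the closing `>`, which is the character a caller would then check to decide whether the tag is complete.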
2014-07-10 23:57:08 +08:00
s#</\s*([a-zA-Z][-.a-zA-Z0-9:_]*)\s*>tHTMLParseErrorcBs#eZdZdd<00>Zd<00>ZRS(s&Exception raised for all parse errors.cCs3|s t<00>||_|d|_|d|_dS(Nii(tAssertionErrortmsgtlinenotoffset(tselfRtposition((s?e:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\HTMLParser.pyt__init__<s   cCsW|j}|jdk r,|d|j}n|jdk rS|d|jd}n|S(Ns , at line %ds , column %di(RRtNoneR(Rtresult((s?e:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\HTMLParser.pyt__str__Bs  N(NN(t__name__t
__module__t__doc__R RR (((s?e:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\HTMLParser.pyR9s t
HTMLParsercBs eZdZdZd<00>Zd<00>Zd<00>Zd<00>Zd<00>ZdZ
d<00>Z d <00>Z d
<00>Z d <00>Zd <00>Zd d<00>Zd<00>Zd<00>Zd<00>Zd<00>Zd<00>Zd<00>Zd<00>Zd<00>Zd<00>Zd<00>Zd<00>Zd<00>Zd<00>Zd<00>ZdZd<00>Z RS( s<>Find tags and other markup and call handler functions.
Usage:
p = HTMLParser()
p.feed(data)
...
p.close()
Start tags are handled by calling self.handle_starttag() or
self.handle_startendtag(); end tags by self.handle_endtag(). The
data between tags is passed from the parser to the derived class
by calling self.handle_data() with the data as argument (the data
may be split up in arbitrary chunks). Entity references are
passed by calling self.handle_entityref() with the entity
reference as the argument. Numeric character references are
passed to self.handle_charref() with the string containing the
reference as the argument.
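The usage described in the docstring above can be sketched as a small subclass that records handler calls. This is an illustration only: `EventCollector` and `collect_events` are hypothetical names, and the import falls back to the Python 2 `HTMLParser` module (the one compiled here) when the Python 3 `html.parser` location is unavailable.

```python
try:
    from html.parser import HTMLParser  # Python 3 location
except ImportError:
    from HTMLParser import HTMLParser   # Python 2 module, as compiled here


class EventCollector(HTMLParser):
    """Hypothetical subclass that records handler calls in order."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.events = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        self.events.append(("start", tag, attrs))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        # Data may arrive in several chunks for larger inputs.
        self.events.append(("data", data))


def collect_events(html):
    """Feed html to a fresh parser and return the recorded events."""
    parser = EventCollector()
    parser.feed(html)
    parser.close()  # handle any buffered data
    return parser.events
```

Calling `collect_events('<a href="x">hi</a>')` yields one start-tag event, one data event, and one end-tag event. Entity and character references are left out of the sketch because their handling differs between the Python 2 module and later versions.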
tscripttstylecCs|j<00>dS(s#Initialize and reset this instance.N(treset(R((s?e:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\HTMLParser.pyRbscCs8d|_d|_t|_d|_tjj|<00>dS(s1Reset this instance. Loses all unprocessed data.ts???N( trawdatatlasttagtinteresting_normalt interestingR t
cdata_elemt
markupbaset
ParserBaseR(R((s?e:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\HTMLParser.pyRfs
    cCs!|j||_|jd<00>dS(s<>Feed data to the parser.
Call this as often as you want, with as little or as much text
as you want (may include '\n').
iN(Rtgoahead(Rtdata((s?e:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\HTMLParser.pytfeednscCs|jd<00>dS(sHandle any buffered data.iN(R(R((s?e:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\HTMLParser.pytclosewscCst||j<00><00><00>dS(N(Rtgetpos(Rtmessage((s?e:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\HTMLParser.pyterror{scCs|jS(s)Return full source of start tag: '<...>'.(t_HTMLParser__starttag_text(R((s?e:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\HTMLParser.pytget_starttag_text<78>scCs2|j<00>|_tjd|jtj<00>|_dS(Ns </\s*%s\s*>(tlowerRtretcompiletIR(Rtelem((s?e:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\HTMLParser.pytset_cdata_mode<64>scCst|_d|_dS(N(RRR R(R((s?e:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\HTMLParser.pytclear_cdata_mode<64>s c
Cs||j}d}t|<00>}x||kr%|jj||<00>}|rT|j<00>}n|jraPn|}||kr<>|j|||!<21>n|j||<00>}||kr<>Pn|j}|d|<00>r7t j
||<00>r<>|j |<00>}n<>|d|<00>r |j |<00>}n<>|d|<00>r*|j |<00>}nm|d|<00>rK|j|<00>}nL|d|<00>rl|j|<00>}n+|d|kr<>|jd<00>|d}nP|dkr"|s<>Pn|jd|d<17>}|dkr|jd|d<17>}|dkr |d}q n
|d7}|j|||!<21>n|j||<00>}q|d |<00>rtj
||<00>}|r<>|j<00>d
d !} |j| <00>|j<00>}|d |d<18>s<>|d}n|j||<00>}qq"d ||kr|j|||d
!<21>|j||d
<17>}nPq|d |<00>rtj
||<00>}|r<>|jd<00>} |j| <00>|j<00>}|d |d<18>sv|d}n|j||<00>}qntj
||<00>}|r<>|r<>|j<00>||kr<>|jd<00>nPq"|d|kr |jd <00>|j||d<17>}q"Pqdstd<00><00>qW|rk||krk|j rk|j|||!<21>|j||<00>}n|||_dS(Nit<s</s<!--s<?s<!iRs&#ii<><69><EFBFBD><EFBFBD>t;t&s#EOF in middle of entity or char refsinteresting.search() lied(RtlenRtsearchtstartRt handle_datat updatepost
startswitht starttagopentmatchtparse_starttagt parse_endtagt parse_commenttparse_pitparse_html_declarationtfindtcharreftgroupthandle_charreftendt entityrefthandle_entityreft
incompleteR!R(
RR?RtitnR5tjR3tktname((s?e:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\HTMLParser.pyR<00>s<>           
       cCs<>|j}|||d!dkr0|jd<00>n|||d!dkrT|j|<00>S|||d!dkrx|j|<00>S|||d!j<00>d kr<>|jd
|d<17>}|d kr<>d S|j||d|!<21>|d S|j|<00>SdS( Nis<!s+unexpected call to parse_html_declaration()is<!--is<![i s <!doctypeRi<><69><EFBFBD><EFBFBD>i(RR!R8tparse_marked_sectionR$R;t handle_decltparse_bogus_comment(RRCRtgtpos((s?e:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\HTMLParser.pyR:<00>s    icCs|j}|||d!dkr0|jd<00>n|jd|d<17>}|dkrVdS|rw|j||d|!<21>n|dS( Nis<!s</s"unexpected call to parse_comment()Ri<><69><EFBFBD><EFBFBD>i(s<!s</(RR!R;thandle_comment(RRCtreportRtpos((s?e:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\HTMLParser.pyRJs  cCs<>|j}|||d!dks,td<00><00>tj||d<17>}|sLdS|j<00>}|j||d|!<21>|j<00>}|S(Nis<?sunexpected call to parse_pi()i<><69><EFBFBD><EFBFBD>(RRtpicloseR/R0t handle_piR?(RRCRR5RE((s?e:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\HTMLParser.pyR9s #  cCs<>d|_|j|<00>}|dkr(|S|j}|||!|_g}tj||d<17>}|sotd<00><00>|j<00>}|jd<00>j <00>|_
}x<>||kr<>t j||<00>}|s<>Pn|jddd<00>\} }
} |
s<>d} nX| d dko| dkns7| d dko2| dknrG| dd!} n| r_|j | <00>} n|j | j <00>| f<00>|j<00>}q<>W|||!j<00>} | d kr+|j<00>\} }d |jkr| |jjd <00>} t|j<00>|jjd <00>}n|t|j<00>}|j|||!<21>|S| jd
<00>rM|j||<00>n/|j||<00>||jkr||j|<00>n|S( Niis#unexpected call to parse_starttag()iis'i<><69><EFBFBD><EFBFBD>t"Rs/>s
(Rs/>(R R"tcheck_for_whole_start_tagRttagfindR5RR?R=R$RtattrfindtunescapetappendtstripRtcountR.trfindR1tendswiththandle_startendtagthandle_starttagtCDATA_CONTENT_ELEMENTSR)(RRCtendposRtattrsR5RFttagtmtattrnametrestt attrvalueR?RR((s?e:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\HTMLParser.pyR6sR     $$  cCs<>|j}tj||<00>}|r<>|j<00>}|||d!}|dkrR|dS|dkr<>|jd|<00>rx|dS|jd|<00>r<>dS|j||d<17>|jd<00>n|dkr<>dS|d kr<>dS||kr<>|S|dSntd
<00><00>dS( NiRt/s/>ii<><69><EFBFBD><EFBFBD>smalformed empty start tagRs6abcdefghijklmnopqrstuvwxyz=/ABCDEFGHIJKLMNOPQRSTUVWXYZswe should not get here!(RtlocatestarttagendR5R?R3R2R!R(RRCRRaREtnext((s?e:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\HTMLParser.pyRRNs,        cCs<>|j}|||d!dks,td<00><00>tj||d<17>}|sLdS|j<00>}tj||<00>}|s$|jdk r<>|j |||!<21>|St
j||d<17>}|s<>|||d!dkr<>|dS|j |<00>Sn|j d<00>j <00>}|jd|j<00><00>}|j|<00>|dS|j d<00>j <00>}|jdk rr||jkrr|j |||!<21>|Sn|j|<00>|j<00>|S( Nis</sunexpected call to parse_endtagii<><69><EFBFBD><EFBFBD>is</>R(RRt endendtagR/R?t
endtagfindR5RR R1RSRJR=R$R;t handle_endtagR*(RRCRR5RKt namematchttagnameR(((s?e:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\HTMLParser.pyR7ns8 #   
cCs!|j||<00>|j|<00>dS(N(R\Rj(RR`R_((s?e:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\HTMLParser.pyR[<00>scCsdS(N((RR`R_((s?e:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\HTMLParser.pyR\<00>scCsdS(N((RR`((s?e:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\HTMLParser.pyRj<00>scCsdS(N((RRG((s?e:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\HTMLParser.pyR><00>scCsdS(N((RRG((s?e:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\HTMLParser.pyRA<00>scCsdS(N((RR((s?e:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\HTMLParser.pyR1<00>scCsdS(N((RR((s?e:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\HTMLParser.pyRL<00>scCsdS(N((Rtdecl((s?e:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\HTMLParser.pyRI<00>scCsdS(N((RR((s?e:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\HTMLParser.pyRP<00>scCsdS(N((RR((s?e:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\HTMLParser.pyt unknown_decl<63>scs2d|kr|S<>fd<00>}tjd||<00>S(NR-cs|j<00>d}yZ|ddkri|d}|dd krSt|dd<00>}n t|<00>}t|<00>SWntk
r<>d|dSXd dl}tjdkr<>id
d 6}t_x0|jj <00>D]\}}t|<00>||<q<>Wny<00>j|SWnt
k
rd |dSXdS(Nit#itxtXis&#R,i<><69><EFBFBD><EFBFBD>u'taposR-(RpRq( tgroupstinttunichrt
ValueErrorthtmlentitydefsRt
entitydefsR tname2codepointt iteritemstKeyError(tstcRwRxRFtv(R(s?e:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\HTMLParser.pytreplaceEntities<65>s&
     s#&(#?[xX]?(?:[0-9a-fA-F]+|\w{1,8}));(R%tsub(RR|R((Rs?e:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\HTMLParser.pyRU<00>s (sscriptsstyleN(!R R RR]RRRRR!R R"R#R)R*RR:RJR9R6RRR7R[R\RjR>RAR1RLRIRPRnRxRU(((s?e:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\HTMLParser.pyRKs<        ^  4 (          (RRR%R&RRBR@R<R4ROt commentcloseRSttagfind_tolerantRTtVERBOSERfRhRit ExceptionRRR(((s?e:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\HTMLParser.pyt<module>s&