PowerToys/PythonHome/Lib/site-packages/bs4/builder/_htmlparser.pyc

55 lines
9.0 KiB
Plaintext

2014-07-10 23:57:08 +08:00
<03>
m<EFBFBD><EFBFBD>Sc
@s<>dZdgZddlmZmZddlZddlZejd \ZZZ edkp<>edkrwedkp<>edko<>edko<>e dkZ
ddl m Z m Z mZmZmZddlmZmZdd lmZmZmZd
Zd efd <00><00>YZdefd <00><00>YZedkr<>edkr<>e
r<>ddlZejd<00>Zee_ejdej<00>Zee_ddl m!Z!m"Z"d<00>Z#d<00>Z$e#e_#e$e_$e%Z
ndS(sCUse the HTMLParser library to parse HTML files that aren't too bad.tHTMLParserTreeBuilderi<72><69><EFBFBD><EFBFBD>(t
HTMLParsertHTMLParseErrorNii(tCDatatCommentt DeclarationtDoctypetProcessingInstruction(tEntitySubstitutiont UnicodeDammit(tHTMLtHTMLTreeBuildertSTRICTs html.parsertBeautifulSoupHTMLParsercBsYeZd<00>Zd<00>Zd<00>Zd<00>Zd<00>Zd<00>Zd<00>Zd<00>Z d<00>Z
RS( cCs_i}x9|D]1\}}|dkr.d}n|||<d}q W|jj|dd|<00>dS(Nts""(tNonetsoupthandle_starttag(tselftnametattrst attr_dicttkeytvaluet attrvalue((sZe:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\site-packages\bs4\builder\_htmlparser.pyR.s  

cCs|jj|<00>dS(N(Rt handle_endtag(RR((sZe:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\site-packages\bs4\builder\_htmlparser.pyR:scCs|jj|<00>dS(N(Rt handle_data(Rtdata((sZe:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\site-packages\bs4\builder\_htmlparser.pyR=scCs<>|jd<00>r*t|jd<00>d<00>}n6|jd<00>rTt|jd<00>d<00>}n t|<00>}yt|<00>}Wnttfk
r<>}d}nX|j|<00>dS(NtxitXu<00>(t
startswithtinttlstriptunichrt
ValueErrort OverflowErrorR(RRt real_nameRte((sZe:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\site-packages\bs4\builder\_htmlparser.pythandle_charref@s 
cCsBtjj|<00>}|dk r'|}n
d|}|j|<00>dS(Ns&%s;(RtHTML_ENTITY_TO_CHARACTERtgetRR(RRt characterR((sZe:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\site-packages\bs4\builder\_htmlparser.pythandle_entityrefQs
  
cCs1|jj<00>|jj|<00>|jjt<00>dS(N(RtendDataRR(RR((sZe:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\site-packages\bs4\builder\_htmlparser.pythandle_commentYs cCsh|jj<00>|jd<00>r/|td<00>}n|dkrDd}n|jj|<00>|jjt<00>dS(NsDOCTYPE tDOCTYPER(RR+RtlenRR(RR((sZe:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\site-packages\bs4\builder\_htmlparser.pyt handle_decl^s   cCse|j<00>jd<00>r.t}|td<00>}nt}|jj<00>|jj|<00>|jj|<00>dS(NsCDATA[(tupperRRR.RRR+R(RRtcls((sZe:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\site-packages\bs4\builder\_htmlparser.pyt unknown_declhs cCsb|jj<00>|jd<00>r>|j<00>jd<00>r>|d }n|jj|<00>|jjt<00>dS(Nt?txmli<6C><69><EFBFBD><EFBFBD>(RR+tendswithtlowerRRR(RR((sZe:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\site-packages\bs4\builder\_htmlparser.pyt handle_pirs
 $ ( t__name__t
__module__RRRR&R*R,R/R2R7(((sZe:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\site-packages\bs4\builder\_htmlparser.pyR -s     
cBs>eZeZeeegZd<00>Zddd<00>Z
d<00>Z RS(cOs&trt|d<n||f|_dS(Ntstrict(tCONSTRUCTOR_TAKES_STRICTtFalset parser_args(Rtargstkwargs((sZe:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\site-packages\bs4\builder\_htmlparser.pyt__init__<5F>s ccsft|t<00>r$|ddtfVdS||g}t||dt<00>}|j|j|j|j fVdS(s<>
:return: A 4-tuple (markup, original encoding, encoding
declared within markup, whether any characters had to be
replaced with REPLACEMENT CHARACTER).
Ntis_html(
t
isinstancetunicodeRR<R tTruetmarkuptoriginal_encodingtdeclared_html_encodingtcontains_replacement_characters(RREtuser_specified_encodingtdocument_declared_encodingt try_encodingstdammit((sZe:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\site-packages\bs4\builder\_htmlparser.pytprepare_markup<75>s  cCsn|j\}}t||<00>}|j|_y|j|<00>Wn,tk
ri}tjtd<00><00>|<00>nXdS(Ns*Python's built-in HTMLParser cannot parse the given document. This is not a bug in Beautiful Soup. The best solution is to install an external parser (lxml or html5lib), and use Beautiful Soup with that parser. See http://www.crummy.com/software/BeautifulSoup/bs4/doc/#installing-a-parser for help.(R=R RtfeedRtwarningstwarntRuntimeWarning(RRER>R?tparserR%((sZe:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\site-packages\bs4\builder\_htmlparser.pyRN<00>s  
N( R8R9R<tis_xmlR
R t
HTMLPARSERtfeaturesR@RRMRN(((sZe:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\site-packages\bs4\builder\_htmlparser.pyR<00>s   sQ\s*((?<=[\'"\s])[^\s/>][^\s/=>]*)(\s*=+\s*(\'[^\']*\'|"[^"]*"|(?![\'"])[^>\s]*))?s<>
<[a-zA-Z][-.a-zA-Z0-9:_]* # tag name
(?:\s+ # whitespace before attribute name
(?:[a-zA-Z_][-.:a-zA-Z0-9_]* # attribute name
(?:\s*=\s* # value indicator
(?:'[^']*' # LITA-enclosed value
|\"[^\"]*\" # LIT-enclosed value
|[^'\">\s]+ # bare value
)
)?
)
)*
\s* # trailing whitespace
(ttagfindtattrfindcCs<>d|_|j|<00>}|dkr(|S|j}|||!|_g}tj||d<17>}|sotd<00><00>|j<00>}||d|!j<00>|_ }x ||kr<>|j
r<>t j||<00>}nt j||<00>}|s<>Pn|j ddd<00>\} }
} |
sd} nX| d dko.| dknsW| d dkoR| dknrg| dd!} n| r|j| <00>} n|j| j<00>| f<00>|j<00>}q<>W|||!j<00>} | dkrv|j<00>\} }d |jkr | |jjd <00>} t|j<00>|jjd <00>}n|t|j<00>}|j
r^|jd |||!d f<16>n|j|||!<21>|S| jd
<00>r<>|j||<00>n/|j||<00>||jkr<>|j|<00>n|S(Niis#unexpected call to parse_starttag()iis'i<><69><EFBFBD><EFBFBD>t"t>s/>s
s junk characters in start tag: %ri(RYs/>(Rt__starttag_texttcheck_for_whole_start_tagtrawdataRVtmatchtAssertionErrortendR6tlasttagR:RWtattrfind_toleranttgrouptunescapetappendtstriptgetpostcountR.trfindterrorRR5thandle_startendtagRtCDATA_CONTENT_ELEMENTStset_cdata_mode(RtitendposR\RR]tkttagtmtattrnametrestRR_tlinenotoffset((sZe:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\site-packages\bs4\builder\_htmlparser.pytparse_starttag<61>s\      $$    cCs2|j<00>|_tjd|jtj<00>|_dS(Ns </\s*%s\s*>(R6t
cdata_elemtretcompiletIt interesting(Rtelem((sZe:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\site-packages\bs4\builder\_htmlparser.pyRl<00>s(&t__doc__t__all__RRtsysROt version_infotmajortminortreleaseR;t bs4.elementRRRRRt
bs4.dammitRR t bs4.builderR
R R RTR RRxRyRatVERBOSEtlocatestarttagendt html.parserRVRWRvRlRD(((sZe:\github\Wox.JSONRPC\Output\Debug\PythonHome\lib\site-packages\bs4\builder\_htmlparser.pyt<module>s8    $(S+      7