This tutorial is a continuation of the previous Yahoo screen-scraping tutorial, which used PHP4. This time we will try a different method using DOM and XPath, which are only supported in PHP5.

First, a little knowledge of XPath is required. More about XPath can be read at: http://www.zvon.org/xxl/XPathTutorial/General/examples.html

There is also a small concern that using XPath is a bit slower than pure DOM traversal; see "Speed: DOM traversal vs. XPath in PHP 5". But I personally think that XPath is neater and easier.

Let's start. First we inspect the document structure using Mozilla Firebug.

Try a very easy case first, which is to grab the title "Top Movies". Copy the XPath using Firebug and we get this query:

/html/body/center/table[8]/tbody/tr/td[5]/table[4]/tbody/tr/td/font/b

The tbody steps are dropped, presumably because the browser adds tbody elements to its own DOM even when the page source does not contain them. Now we get our first XPath query:

/html/body/center/table[8]/tr/td[5]/table[4]/tr[1]/td/font

The next, harder case is to grab the contents. The XPath query from Firebug is:

/html/body/center/table[8]/tbody/tr/td[5]/table[4]/tbody/tr[2]/td[2]/a/font/b

Here the row index tr[2] is dropped as well, so that the query matches every row. The final XPath query for the content is:

/html/body/center/table[8]/tr/td[5]/table[4]/tr/td[2]/a/font/b

The final step is to put the two XPath queries into a few lines of code, and we're done.
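
The original code listing is not available here, so what follows is only a minimal sketch of how the two queries could be run with PHP5's DOMDocument and DOMXPath classes. The helper name xpathTexts, the inline sample HTML and the shortened queries are all assumptions made for illustration; against the real page you would load the fetched HTML instead and use the two full queries from this tutorial.

```php
<?php
// Hypothetical helper (not from the original tutorial): run one XPath query
// against an HTML string and return the trimmed text of each matching node.
function xpathTexts($html, $query)
{
    // Real pages are rarely valid HTML, so collect parser warnings
    // internally instead of printing them.
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query($query);

    $texts = array();
    if ($nodes !== false) {
        foreach ($nodes as $node) {
            $texts[] = trim($node->textContent);
        }
    }
    return $texts;
}

// Assumed stand-in for the real page: one table whose first row holds the
// title and whose other rows hold one linked entry each.
$html = '<html><body><center><table>'
      . '<tr><td><font><b>Top Movies</b></font></td></tr>'
      . '<tr><td>1.</td><td><a href="#"><font><b>Movie A</b></font></a></td></tr>'
      . '<tr><td>2.</td><td><a href="#"><font><b>Movie B</b></font></a></td></tr>'
      . '</table></center></body></html>';

// Shortened versions of the tutorial's two queries, adapted to the sample.
// Note that there is no tbody step, as in the final queries above.
$titleQuery   = '/html/body/center/table/tr[1]/td/font';
$contentQuery = '/html/body/center/table/tr/td[2]/a/font/b';

$titles = xpathTexts($html, $titleQuery);
$items  = xpathTexts($html, $contentQuery);

echo implode(', ', $titles), "\n";
foreach ($items as $item) {
    echo ' - ', $item, "\n";
}
```

The title query selects a single node, while the content query has no row index and therefore returns one node per matching row, which is why the helper always returns an array.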