DanielCambray/SimplePageCrawler
ZF2 SimplePageCrawler module

Version 0.3.0 Created by Vincent Blanchon

Introduction

SimplePageCrawler is a web page crawler. It retrieves the following information from a page:

  • Title
  • Meta (description, Open Graph, etc.)
  • H1, H2, etc.
  • List of the images
  • List of the links
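
To make the list above concrete, here is a minimal sketch of how this kind of information can be pulled out of an HTML string with PHP's bundled DOM extension. It is not the module's own code: the function name `extractPageInfo` and the shape of the returned array are assumptions made for illustration only.

```php
<?php
// Hypothetical helper, NOT part of SimplePageCrawler: it only illustrates
// the kind of data listed above, using DOMDocument and DOMXPath.
function extractPageInfo(string $html): array
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid; buffer libxml warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $info = ['title' => null, 'meta' => [], 'headings' => [], 'images' => [], 'links' => []];

    // Title: text of the first <title> element, if any.
    $titleNode = $xpath->query('//title')->item(0);
    if ($titleNode !== null) {
        $info['title'] = trim($titleNode->textContent);
    }

    // Meta: keyed by the "name" attribute, falling back to "property" (used by Open Graph).
    foreach ($xpath->query('//meta[@content]') as $meta) {
        $key = $meta->getAttribute('name');
        if ($key === '') {
            $key = $meta->getAttribute('property');
        }
        if ($key !== '') {
            $info['meta'][$key] = $meta->getAttribute('content');
        }
    }

    // Headings: grouped by tag name (h1, h2, ...).
    foreach ($xpath->query('//h1 | //h2 | //h3 | //h4 | //h5 | //h6') as $heading) {
        $info['headings'][$heading->nodeName][] = trim($heading->textContent);
    }

    // Images and links: attribute values exactly as written in the page.
    foreach ($xpath->query('//img[@src]') as $img) {
        $info['images'][] = $img->getAttribute('src');
    }
    foreach ($xpath->query('//a[@href]') as $anchor) {
        $info['links'][] = $anchor->getAttribute('href');
    }

    return $info;
}
```

In this sketch, image and link URLs are returned as they appear in the markup, so relative URLs are not resolved against the page address.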

Usage

Get page information:

$crawler = $this->getServiceLocator()->get('SimplePageCrawler');
$page = $crawler->get('http://www.nytimes.com');

echo sprintf('The title is "%s"', $page->getTitle());
echo sprintf('The description is "%s"', $page->getMeta('description'));

You can also use the action helper:

$page = $this->simplePageCrawler('http://www.nytimes.com');

echo sprintf('The title is "%s"', $page->getTitle());
echo sprintf('The description is "%s"', $page->getMeta('description'));

Advanced usage

You can get Open Graph metadata:

$page = $this->simplePageCrawler('http://www.nytimes.com');
$metas = $page->getMeta()->getOpenGraph();
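
This README does not document the shape of the value returned by `getOpenGraph()`. As a rough illustration of the idea only, Open Graph entries are meta tags whose key starts with `og:`, so they can be separated from the other metadata by prefix. The helper below is hypothetical and assumes the metadata is available as a plain key/value array.

```php
<?php
// Hypothetical helper, NOT the module's API: selects Open Graph entries
// from an array of meta values keyed by name/property.
function filterOpenGraph(array $meta): array
{
    $openGraph = [];
    foreach ($meta as $key => $value) {
        $key = (string) $key;
        // Open Graph properties are prefixed with "og:" (og:title, og:image, ...).
        if (strncmp($key, 'og:', 3) === 0) {
            $openGraph[substr($key, 3)] = $value;
        }
    }

    return $openGraph;
}
```

Stripping the `og:` prefix from the keys is a design choice of this sketch, so that `og:title` becomes `title`. The module may well keep the full property name.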
