Merged
38 commits
18d7f1d
feat: Moving getting started into actors/running and actors/developme…
mtrunkat Mar 10, 2023
2c13f5d
Renaming AcademyCard to Card component
mtrunkat Mar 10, 2023
0265e4f
Merge branch 'feature/docs-improvements-2' of github.com:apify/apify-…
mtrunkat Mar 10, 2023
4f8215e
fix: academy tutorials sections position
PerVillalva Mar 15, 2023
4c4fe3a
Update sources/academy/tutorials/api/index.md
mtrunkat Mar 16, 2023
75068aa
Update sources/academy/tutorials/api/index.md
mtrunkat Mar 16, 2023
c505907
Update sources/platform/actors/development/index.md
mtrunkat Mar 16, 2023
54ca25d
Update sources/platform/actors/development/index.md
mtrunkat Mar 16, 2023
df54cdc
Update sources/platform/actors/running/index.md
mtrunkat Mar 16, 2023
fe966dd
Update sources/platform/actors/running/index.md
mtrunkat Mar 16, 2023
bb6401d
Update sources/platform/actors/running/index.md
mtrunkat Mar 16, 2023
a856265
Update sources/platform/actors/running/input_and_output.md
mtrunkat Mar 16, 2023
bf6b1cb
Update sources/platform/actors/running/index.md
mtrunkat Mar 16, 2023
4c0cf59
Update sources/platform/actors/development/continuous_integration.md
mtrunkat Mar 16, 2023
f41765a
Update continuous_integration.md
mtrunkat Mar 16, 2023
d130799
Update sources/platform/actors/development/index.md
mtrunkat Mar 16, 2023
dd7cb12
Apply suggestions from code review
mtrunkat Mar 16, 2023
f2e9849
fix: broken link
mnmkng Mar 16, 2023
6f7dfd9
fix outdated examples of running programmatically
mnmkng Mar 17, 2023
ed73fed
fix link to outdated content
mnmkng Mar 17, 2023
b53003d
fix some minor things
mnmkng Mar 17, 2023
26c8699
Update sources/platform/actors/running/index.md
mtrunkat Mar 20, 2023
d7f903a
Update sources/platform/index.mdx
mtrunkat Mar 20, 2023
9563120
Update sources/platform/index.mdx
mtrunkat Mar 20, 2023
f4b8c83
Fixing search vs maps
mtrunkat Mar 20, 2023
8ef3015
Fixing search vs maps
mtrunkat Mar 20, 2023
4c9ade8
Merge branch 'feature/docs-improvements-2' of github.com:apify/apify-…
mtrunkat Mar 20, 2023
595554a
Linting
mtrunkat Mar 20, 2023
037599a
Lint
mtrunkat Mar 20, 2023
f05ffde
Lint
mtrunkat Mar 20, 2023
9281dc7
Update sources/academy/platform/expert_scraping_with_apify/actors_web…
mtrunkat Mar 21, 2023
36cdf40
Apply suggestions from code review
mtrunkat Mar 21, 2023
a9392b3
Apply suggestions from code review
mtrunkat Mar 21, 2023
dd36bd2
Apply suggestions from code review
mtrunkat Mar 21, 2023
e5da3ca
Update sources/platform/homepage_content.json
mtrunkat Mar 21, 2023
0a5ab7e
Update sources/platform/homepage_content.json
mtrunkat Mar 21, 2023
30655f9
feat: Reorganizing access rights (#543)
mtrunkat Mar 21, 2023
919dc6f
Merge branch 'feature/docs-improvements-2' of github.com:apify/apify-…
mtrunkat Mar 21, 2023
4 changes: 2 additions & 2 deletions sources/academy/index.mdx
@@ -6,7 +6,7 @@ slug: /
displayed_sidebar: courses
hide_table_of_contents: true
---
import AcademyCard from "@site/src/components/AcademyCard";
import Card from "@site/src/components/Card";
import CardGrid from "@site/src/components/CardGrid";
import homepageContent from "./homepage_content.json";

@@ -20,7 +20,7 @@ Learn everything about web scraping and automation with our free courses that wi
<CardGrid>
{
sections.map((section) =>
<AcademyCard
<Card
title={section.title}
desc={section.description}
imageUrl={section.imageUrl}
@@ -29,7 +29,7 @@ Prior to moving forward, please read over these resources:

- Read about [running actors, handling actor inputs, memory and CPU](/platform/actors/running).
- Learn about [actor webhooks](/platform/integrations/webhooks), which we will implement in the next lesson.
- Learn [how to run actors](/platform/tutorials/run-actor-and-retrieve-data-via-api#run-an-actor-or-task) using Apify's REST API.
- Learn [how to run Actors](/academy/api/run-actor-and-retrieve-data-via-api) using Apify's REST API.

## Knowledge check 📝 {#quiz}

@@ -123,7 +123,7 @@ Now we're done, and we can push it up to the Apify platform with the `apify push

## Setting up the webhook {#setting-up-the-webhook}

Since we'll be calling the actor via the [Apify API](/platform/tutorials/run-actor-and-retrieve-data-via-api#run-an-actor-or-task), we'll need to grab hold of the ID of the actor we just created and pushed to the platform. The ID is always accessible through the **Settings** page of the actor.
Since we'll be calling the Actor via the [Apify API](/academy/api/run-actor-and-retrieve-data-via-api), we'll need to grab hold of the ID of the Actor we just created and pushed to the platform. The ID is always accessible through the **Settings** page of the actor.

![Actor ID in actor settings](./images/actor-settings.jpg)

19 changes: 19 additions & 0 deletions sources/academy/tutorials/api/index.md
@@ -0,0 +1,19 @@
---
title: API tutorials
description: A collection of various tutorials explaining how to interact with the Apify platform programmatically using its API.
sidebar_position: 20
category: tutorials
slug: /api
---

# API Tutorials 💻📚

**A collection of various tutorials explaining how to interact with the Apify platform programmatically using its API.**

---

This section explains how you can run [Apify Actors](/platform/actors) using Apify's [API](/api/v2), retrieve their results, and integrate them into your own product and workflows. You can do this using a raw HTTP client, or you can benefit from using one of our API clients for:

- [JavaScript](/api/client/js/)
- [Python](/api/client/python)

@@ -1,17 +1,15 @@
---
title: Run actor and retrieve data via API
description: Learn how to run an actor/task via the Apify API, wait for the job to finish, and retrieve its output data. Your key to integrating actors with your projects.
title: Run Actor and retrieve data via API
description: Learn how to run an Actor/task via the Apify API, wait for the job to finish, and retrieve its output data. Your key to integrating Actors with your projects.
sidebar_position: 6
slug: /tutorials/run-actor-and-retrieve-data-via-api
slug: /api/run-actor-and-retrieve-data-via-api
---

# Run an actor or task and retrieve data via API

**Learn how to run an actor/task via the Apify API, wait for the job to finish, and retrieve its output data. Your key to integrating actors with your projects.**
**Learn how to run an Actor/task via the Apify API, wait for the job to finish, and retrieve its output data. Your key to integrating Actors with your projects.**

---

The most popular way to [integrate](https://help.apify.com/en/collections/1669767-integrating-with-apify) the Apify platform with an external project/application is by programmatically running an [Actor](../actors/index.md) or [task](../actors/running/tasks.md), waiting for it to complete its run, then collecting its data and using it within the project. Though this process sounds somewhat complicated, it's actually quite easy to do; however, due to the plethora of features offered on the Apify platform, new users may not be sure how exactly to implement this type of integration. So, let's dive in and see how you can do it.
The most popular way of [integrating](https://help.apify.com/en/collections/1669767-integrating-with-apify) the Apify platform with an external project/application is by programmatically running an [Actor](/platform/actors) or [task](/platform/actors/running/tasks), waiting for it to complete its run, then collecting its data and using it within the project. Though this process sounds somewhat complicated, it's actually quite easy to do; however, due to the plethora of features offered on the Apify platform, new users may not be sure how exactly to implement this type of integration. So, let's dive in and see how you can do it.

> Remember to check out our [API documentation](/api/v2) with examples in different languages and a live API console. We also recommend testing the API with a nice desktop client like [Postman](https://www.getpostman.com/) or [Insomnia](https://insomnia.rest).

@@ -24,15 +22,15 @@ If the actor being run via API takes 5 minutes or less to complete a typical run

## Run an Actor or task {#run-an-actor-or-task}

> If you are unsure about the differences between an Actor and a task, you can read about them in the [tasks](../actors/running/tasks.md) documentation. In brief, tasks are just pre-configured inputs for Actors.
> If you are unsure about the differences between an Actor and a task, you can read about them in the [tasks](/platform/actors/running/tasks) documentation. In brief, tasks are just pre-configured inputs for Actors.

The API endpoints and usage (for both sync and async) for [Actors](/api/v2#/reference/actors/run-collection/run-actor) and [tasks](/api/v2#/reference/actor-tasks/run-collection/run-task) are essentially the same.

To run, or **call**, an actor/task, you will need a few things:
To run, or **call**, an Actor/task, you will need a few things:

- The name or ID of the actor/task. The name looks like `username~actorName` or `username~taskName`. The ID can be retrieved on the **Settings** page of the actor/task.
- The name or ID of the Actor/task. The name looks like `username~actorName` or `username~taskName`. The ID can be retrieved on the **Settings** page of the Actor/task.

- Your [API token](../integrations/index.md), which you can find on the **Integrations** page in the [Apify Console](https://console.apify.com/account?tab=integrations) (make sure it does not get leaked anywhere!).
- Your [API token](/platform/integrations), which you can find on the **Integrations** page in [Apify Console](https://console.apify.com/account?tab=integrations) (do not share it with anyone!).

- Possibly an input, which is passed in JSON format as the request's **body**.

@@ -60,7 +58,7 @@ We can also add settings for the actor (which will override the default settings
https://api.apify.com/v2/acts/ACTOR_NAME_OR_ID/runs?token=YOUR_TOKEN&memory=8192&build=beta
```

This works nearly identically for both actors and tasks; however, for tasks there is no reason to specify a [`build`](../actors/development/builds.md) parameter, as a task already has only one specific actor build which cannot be changed with query parameters.
This works in almost exactly the same way for both Actors and tasks; however, for tasks, there is no reason to specify a [`build`](/platform/actors/development/builds) parameter, as a task already has only one specific Actor build which cannot be changed with query parameters.
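
The URL above can be assembled programmatically. The following is a minimal Node.js sketch, assuming only the `acts/.../runs` endpoint and the `token`, `memory`, and `build` query parameters shown above; `buildRunUrl` is a hypothetical helper name, not part of any Apify client.

```javascript
// Hypothetical helper (not an Apify API): builds the "run Actor" URL shown above.
// `options` holds optional query parameters such as `memory` or `build`.
function buildRunUrl(actorNameOrId, token, options = {}) {
    const url = new URL(`https://api.apify.com/v2/acts/${encodeURIComponent(actorNameOrId)}/runs`);
    url.searchParams.set('token', token);
    for (const [key, value] of Object.entries(options)) {
        // Skip parameters that were not provided so they do not appear in the URL.
        if (value !== undefined) url.searchParams.set(key, String(value));
    }
    return url.toString();
}

// Placeholder values taken from the tutorial text.
console.log(buildRunUrl('ACTOR_NAME_OR_ID', 'YOUR_TOKEN', { memory: 8192, build: 'beta' }));
```

For a task, the `build` option would simply be left out, since the task already pins its Actor build.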

### Input JSON {#input-json}

@@ -94,7 +92,7 @@ If your synchronous run exceeds the 5-minute time limit, the response will be a

### Synchronous runs with dataset output {#synchronous-runs-with-dataset-output}

Most actor runs will store their data in the default [dataset](../storage/dataset.md). The Apify API provides **run-sync-get-dataset-items** endpoints for [actors](/api/v2#/reference/actors/run-actor-synchronously-and-get-dataset-items/run-actor-synchronously-with-input-and-get-dataset-items) and [tasks](/api/v2#/reference/actor-tasks/run-task-synchronously-and-get-dataset-items/run-task-synchronously-and-get-dataset-items-(post)), which allow you to run an actor and receive the items from the default dataset once the run has completed.
Most Actor runs will store their data in the default [dataset](/platform/storage/dataset). The Apify API provides **run-sync-get-dataset-items** endpoints for [actors](/api/v2#/reference/actors/run-actor-synchronously-and-get-dataset-items/run-actor-synchronously-with-input-and-get-dataset-items) and [tasks](/api/v2#/reference/actor-tasks/run-task-synchronously-and-get-dataset-items/run-task-synchronously-and-get-dataset-items-(post)), which allow you to run an Actor and receive the items from the default dataset once the run has finished.

Here is a simple Node.js example of calling a task via the API and logging the dataset items to the console:

@@ -131,7 +129,7 @@ items.forEach((item) => {

### Synchronous runs with key-value store output {#synchronous-runs-with-key-value-store-output}

[Key-value stores](../storage/key_value_store.md) are useful for storing files like images, HTML snapshots, or JSON data. The Apify API provides **run-sync** endpoints for [actors](/api/v2#/reference/actors/run-actor-synchronously/with-input) and [tasks](/api/v2#/reference/actor-tasks/run-task-synchronously/run-task-synchronously), which allow you to run a specific task and receive the output. By default, they return the `OUTPUT` record from the default key-value store.
[Key-value stores](/platform/storage/key-value-store) are useful for storing files like images, HTML snapshots, or JSON data. The Apify API provides **run-sync** endpoints for [actors](/api/v2#/reference/actors/run-actor-synchronously/with-input) and [tasks](/api/v2#/reference/actor-tasks/run-task-synchronously/run-task-synchronously), which allow you to run a specific task and receive the output. By default, they return the `OUTPUT` record from the default key-value store.

> For more detailed information, check the [API reference](/api/v2#/reference/actors/run-actor-synchronously-and-get-dataset-items/run-actor-synchronously-with-input-and-get-dataset-items).

@@ -165,13 +163,13 @@ Once again, the final response will be the **run info object**; however, now its

#### Webhooks {#webhooks}

If you have a server, [webhooks](../integrations/webhooks/index.md) are the most elegant and flexible solution for integrations with Apify. You can simply set up a webhook for any actor or task, and that webhook will send a POST request to your server after an [event](../integrations/webhooks/events.md) has occurred.
If you have a server, [webhooks](/platform/integrations/webhooks) are the most elegant and flexible solution for integrations with Apify. You can simply set up a webhook for any Actor or task, and that webhook will send a POST request to your server after an [event](/platform/integrations/webhooks/events) has occurred.

Usually, this event is a successfully finished run, but you can also set a different webhook for failed runs, etc.

![Webhook example](./images/webhook.png)

The webhook will send you a pretty complicated [JSON object](../integrations/webhooks/actions.md), but usually you are only interested in the `resource` object within the response, which is essentially just the **run info** JSON from the previous sections. We can leave the payload template as is as for our example use case, since it is what we need.
The webhook will send you a pretty complicated [JSON object](/platform/integrations/webhooks/actions), but usually, you would only be interested in the `resource` object within the response, which is essentially just the **run info** JSON from the previous sections. We can leave the payload template as is for our example since it is all we need.

Once your server receives this request from the webhook, you know that the event happened, and you can ask for the complete data.
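
As a rough illustration of handling that request, the sketch below pulls the run info out of an already-parsed webhook body. It assumes only what the text states: that the payload carries a `resource` object containing `status`, `defaultDatasetId`, and `defaultKeyValueStoreId`. The function name `extractRunInfo` and the sample values are hypothetical.

```javascript
// Hypothetical helper: extracts the run info from a parsed webhook payload.
// Assumes the payload has a `resource` object, as described in the text above.
function extractRunInfo(payload) {
    if (!payload || typeof payload.resource !== 'object' || payload.resource === null) {
        throw new Error('Webhook payload has no `resource` object');
    }
    const { status, defaultDatasetId, defaultKeyValueStoreId } = payload.resource;
    return { status, defaultDatasetId, defaultKeyValueStoreId };
}

// Made-up sample payload, for illustration only.
const samplePayload = {
    resource: {
        status: 'SUCCEEDED',
        defaultDatasetId: 'DATASET_ID',
        defaultKeyValueStoreId: 'STORE_ID',
    },
};
console.log(extractRunInfo(samplePayload));
```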

@@ -195,7 +193,7 @@ Once a status of `SUCCEEDED` or `FAILED` has been received, we know the run has

Unless you used the [synchronous call](#synchronous-flow) mentioned above, you will have to make one additional request to the API to retrieve the data.

The **run info** JSON also contains the IDs of the default [dataset](../storage/dataset.md) and [key-value store](../storage/key_value_store.md) that are allocated separately for each run, which is usually everything you need. The fields are called `defaultDatasetId` and `defaultKeyValueStoreId`.
The **run info** JSON also contains the IDs of the default [dataset](/platform/storage/dataset) and [key-value store](/platform/storage/key-value-store) that are allocated separately for each run, which is usually everything you need. The fields are called `defaultDatasetId` and `defaultKeyValueStoreId`.
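
To make the use of `defaultDatasetId` concrete, here is a small sketch that turns a run info object into a dataset items URL of the form used later in this tutorial (`/v2/datasets/DATASET_ID/items`). The helper name `buildDatasetItemsUrl` is hypothetical, and the run info is assumed to be a plain object with a `defaultDatasetId` field.

```javascript
// Hypothetical helper: builds the dataset items URL from a run info object.
// `params` holds optional query parameters such as `format` or `offset`.
function buildDatasetItemsUrl(runInfo, params = {}) {
    if (!runInfo || !runInfo.defaultDatasetId) {
        throw new Error('Run info is missing `defaultDatasetId`');
    }
    const url = new URL(`https://api.apify.com/v2/datasets/${encodeURIComponent(runInfo.defaultDatasetId)}/items`);
    for (const [key, value] of Object.entries(params)) {
        url.searchParams.set(key, String(value));
    }
    return url.toString();
}

// Placeholder ID; a real one comes from the run info returned by the API.
console.log(buildDatasetItemsUrl({ defaultDatasetId: 'DATASET_ID' }, { format: 'csv', offset: 250000 }));
```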

#### Retrieving a dataset {#retrieve-a-dataset}

@@ -219,7 +217,7 @@ https://api.apify.com/v2/datasets/DATASET_ID/items?format=csv&offset=250000

#### Retrieving a key-value store {#retrieve-a-key-value-store}

> [Key-value stores](../storage/key_value_store.md) are mainly useful if you have a single output or any kind of files that cannot be [stringified](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/JSON/stringify) (such as images or PDFs).
> [Key-value stores](/platform/storage/key-value-store) are mainly useful if you have a single output or any kind of files that cannot be [stringified](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/JSON/stringify) (such as images or PDFs).

When you want to retrieve something from a key-value store, the `defaultKeyValueStoreId` is _not_ enough. You also need to know the name (or **key**) of the record you want to retrieve.

3 changes: 1 addition & 2 deletions sources/academy/tutorials/apify_scrapers/index.md
@@ -1,7 +1,7 @@
---
title: Apify scrapers
description: Discover Apify's ready-made web scraping and automation tools. Compare Web Scraper, Cheerio Scraper and Puppeteer Scraper to decide which is right for you.
sidebar_position: 3.2
sidebar_position: 13.2
slug: /apify-scrapers
---

@@ -42,4 +42,3 @@ Puppeteer Scraper is the most powerful scraper tool in our arsenal (aside from d
Puppeteer is a Node.js library, so knowledge of Node.js and its paradigms is expected when working with Puppeteer Scraper.

[Visit the Puppeteer Scraper tutorial to get started!](./puppeteer_scraper.md)

35 changes: 0 additions & 35 deletions sources/platform/about.md

This file was deleted.
