Resumable Iterations
For many download targets, Instaloader can resume a
previously interrupted iteration. It provides an interruptible
iterator, NodeIterator, and a context manager,
resumable_iteration(), both of which are presented here.
Added in version 4.5.
NodeIterator
- class instaloader.NodeIterator(context: InstaloaderContext, query_hash: str | None, edge_extractor: Callable[[Dict[str, Any]], Dict[str, Any]], node_wrapper: Callable[[Dict], T], query_variables: Dict[str, Any] | None = None, query_referer: str | None = None, first_data: Dict[str, Any] | None = None, is_first: Callable[[T, T | None], bool] | None = None, doc_id: str | None = None)
Iterate the nodes within edges in a GraphQL pagination. Instances of this class are returned by many (but not all) of Instaloader’s
Post-returning functions (such as Profile.get_posts() etc.).
What makes this iterator special is its ability to freeze/store its current state, e.g. to interrupt an iteration, and later thaw/resume from where it left off.
You can freeze a NodeIterator with NodeIterator.freeze():

    post_iterator = profile.get_posts()
    try:
        for post in post_iterator:
            do_something_with(post)
    except KeyboardInterrupt:
        save("resume_information.json", post_iterator.freeze())

and later reuse it with NodeIterator.thaw() on an equally-constructed NodeIterator:

    post_iterator = profile.get_posts()
    post_iterator.thaw(load("resume_information.json"))

(Appropriate methods to load and save the FrozenNodeIterator are, e.g., load_structure_from_file() and save_structure_to_file().)
A FrozenNodeIterator can only be thawed with a matching NodeIterator, i.e. a NodeIterator instance that has been constructed with the same parameters as the instance represented by the FrozenNodeIterator in question. This ensures that an iteration cannot be resumed in a wrong, non-matching loop. As a quick way to distinguish iterators that are saved e.g. in files, there is the NodeIterator.magic string: two NodeIterators match if and only if they have the same magic.
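The freeze/thaw contract described above can be illustrated with a small standard-library sketch. It is not Instaloader's implementation: `ResumableList`, `FrozenCounter`, and the `source_id` field are hypothetical stand-ins, and the matching check is reduced to a single string comparison.

```python
import json
from typing import Iterator, List, NamedTuple


class FrozenCounter(NamedTuple):
    """Hypothetical stand-in for FrozenNodeIterator: just enough state to resume."""
    source_id: str    # identifies which iteration this state belongs to
    total_index: int  # number of items already yielded


class ResumableList:
    """Toy iterator over a list that can be frozen and thawed."""

    def __init__(self, source_id: str, items: List[str]) -> None:
        self.source_id = source_id
        self._items = items
        self._index = 0

    def __iter__(self) -> Iterator[str]:
        return self

    def __next__(self) -> str:
        if self._index >= len(self._items):
            raise StopIteration
        item = self._items[self._index]
        self._index += 1
        return item

    def freeze(self) -> FrozenCounter:
        # Capture the current position so a later run can continue from it.
        return FrozenCounter(self.source_id, self._index)

    def thaw(self, frozen: FrozenCounter) -> None:
        # Refuse to resume on a used iterator or with state from another iteration.
        if self._index:
            raise ValueError("thaw() called on an already-used iterator")
        if frozen.source_id != self.source_id:
            raise ValueError("resume information belongs to a different iteration")
        self._index = frozen.total_index


items = ["p1", "p2", "p3", "p4"]

first_run = ResumableList("profile-a", items)
seen = [next(first_run), next(first_run)]      # pretend we were interrupted here
state = json.dumps(first_run.freeze()._asdict())

second_run = ResumableList("profile-a", items)  # equally-constructed iterator
second_run.thaw(FrozenCounter(**json.loads(state)))
remaining = list(second_run)                    # continues after the second item
```

The second run has to be constructed with the same arguments as the first; thawing the saved state into an iterator with a different `source_id` raises an error instead of silently resuming the wrong loop.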
See also resumable_iteration() for a high-level context manager that handles a resumable iteration.
- property count: int | None
The count as returned by Instagram. This is not always the total count this iterator will yield.
- property first_item: T | None
If this iterator has produced any items, returns the first item produced.
It is possible to override what is considered the first item (for example, to consider the newest item in case items are not in strict chronological order) by passing a callback function as the is_first parameter when creating the class.
Added in version 4.8.
Changed in version 4.9.2: What is considered the first item can be overridden.
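As an illustration of how an is_first-style callback could change which item counts as the first one, here is a minimal sketch. The `track_first` helper, the `newest_wins` callback, and the dictionary-based posts are hypothetical; only the callback shape (candidate item, current first item or None, returning a bool) follows the is_first signature documented above.

```python
from typing import Callable, Dict, Iterable, Optional

Post = Dict[str, object]


def track_first(
    items: Iterable[Post],
    is_first: Optional[Callable[[Post, Optional[Post]], bool]] = None,
) -> Optional[Post]:
    """Return the item considered 'first' after consuming all items."""
    first: Optional[Post] = None
    for item in items:
        if is_first is not None:
            # The callback decides whether this item replaces the current first.
            if is_first(item, first):
                first = item
        elif first is None:
            # Default behaviour: the first item produced stays the first item.
            first = item
    return first


def newest_wins(candidate: Post, current: Optional[Post]) -> bool:
    """Hypothetical callback: treat the item with the latest timestamp as first."""
    return current is None or candidate["timestamp"] > current["timestamp"]


# A pinned, older post is produced before newer ones.
posts = [
    {"id": "pinned", "timestamp": 1},
    {"id": "newest", "timestamp": 9},
    {"id": "older", "timestamp": 5},
]

default_first = track_first(posts)
overridden_first = track_first(posts, newest_wins)
```

Without a callback the pinned post is reported as the first item; with `newest_wins` the most recent post is reported instead, which is the kind of case the is_first parameter is described as covering.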
- freeze() → FrozenNodeIterator
Freeze the iterator for later resuming.
- property magic: str
Magic string for easily identifying a matching iterator file for resuming (hash of some parameters).
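To make "hash of some parameters" concrete, the following sketch derives a short, stable fingerprint from a set of iteration parameters. Which parameters Instaloader actually hashes and which hash function it uses are not stated here, so `magic_for`, its arguments, and the choice of BLAKE2b are assumptions made purely for illustration.

```python
import hashlib
import json
from typing import Any, Dict, Optional


def magic_for(query_id: str, variables: Dict[str, Any], username: Optional[str]) -> str:
    """Hypothetical helper: short, stable fingerprint of iteration parameters."""
    # sort_keys makes the result independent of dictionary insertion order.
    canonical = json.dumps([query_id, variables, username], sort_keys=True)
    # A 6-byte digest gives a 12-character hex string, short enough for a file name.
    return hashlib.blake2b(canonical.encode("utf-8"), digest_size=6).hexdigest()


same_a = magic_for("posts", {"id": "42", "first": 12}, "alice")
same_b = magic_for("posts", {"first": 12, "id": "42"}, "alice")
different = magic_for("posts", {"id": "43", "first": 12}, "alice")
```

Because equal parameters always give the same string, such a value can be embedded in a resume file name to find the file that belongs to a given iterator.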
- thaw(frozen: FrozenNodeIterator) → None
Use this iterator to resume an earlier iteration.
- Raises:
  If
  - the iterator on which this method is called has already been used, or
  - the given FrozenNodeIterator does not match, i.e. belongs to a different iteration.
- class instaloader.FrozenNodeIterator(query_hash, query_variables, query_referer, context_username, total_index, best_before, remaining_data, first_node, doc_id)
A serializable representation of a NodeIterator instance, saving its iteration state.
It can be serialized and deserialized with save_structure_to_file() and load_structure_from_file(), as well as with json and pickle thanks to being a NamedTuple.
resumable_iteration
- instaloader.resumable_iteration(context: InstaloaderContext, iterator: Iterable, load: Callable[[InstaloaderContext, str], Any], save: Callable[[FrozenNodeIterator, str], None], format_path: Callable[[str], str], check_bbd: bool = True, enabled: bool = True) → Iterator[Tuple[bool, int]]
High-level context manager to handle a resumable iteration that can be interrupted with a KeyboardInterrupt or an AbortDownloadException.
It can be used as follows to automatically load a previously-saved state into the iterator, save the iterator’s state when interrupted, and delete the resume file upon completion:

    import json
    from instaloader import FrozenNodeIterator, resumable_iteration

    # L is an existing Instaloader instance; profile is a Profile obtained through it.
    post_iterator = profile.get_posts()
    with resumable_iteration(
            context=L.context,
            iterator=post_iterator,
            load=lambda _, path: FrozenNodeIterator(**json.load(open(path))),
            save=lambda fni, path: json.dump(fni._asdict(), open(path, 'w')),
            format_path=lambda magic: "resume_info_{}.json".format(magic)
    ) as (is_resuming, start_index):
        for post in post_iterator:
            do_something_with(post)

It yields a tuple (is_resuming, start_index).
When the passed iterator is not a NodeIterator, it behaves as if resumable_iteration was not used, just executing the inner body.
- Parameters:
  - context – The InstaloaderContext.
  - iterator – The fresh NodeIterator.
  - load – Loads a FrozenNodeIterator from the given path. The object is ignored if it has a different type.
  - save – Saves the given FrozenNodeIterator to the given path.
  - format_path – Returns the path to the resume file for the given magic.
  - check_bbd – Whether to check the best before date and reject an expired FrozenNodeIterator.
  - enabled – Set to False to disable all functionality and simply execute the inner body.
Changed in version 4.7: Also interrupt on AbortDownloadException.
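The load, save-on-interrupt, and delete-on-completion behaviour described for resumable_iteration can be sketched with the standard library alone. This is a simplified illustration, not Instaloader's code: the `Counter` iterator, the `resumable` context manager, and the JSON helpers are hypothetical, and features such as check_bbd, enabled, and AbortDownloadException handling are left out.

```python
import json
import os
import tempfile
from contextlib import contextmanager
from typing import Callable, Dict, Iterator, Tuple


class Counter:
    """Toy resumable iterator: yields 0..stop-1 and remembers its position."""
    magic = "toy"  # stands in for NodeIterator.magic

    def __init__(self, stop: int) -> None:
        self.stop = stop
        self.index = 0

    def __iter__(self) -> "Counter":
        return self

    def __next__(self) -> int:
        if self.index >= self.stop:
            raise StopIteration
        value = self.index
        self.index += 1
        return value

    def freeze(self) -> Dict[str, int]:
        return {"index": self.index}

    def thaw(self, frozen: Dict[str, int]) -> None:
        self.index = frozen["index"]


@contextmanager
def resumable(iterator: Counter,
              load: Callable[[str], Dict[str, int]],
              save: Callable[[Dict[str, int], str], None],
              format_path: Callable[[str], str]) -> Iterator[Tuple[bool, int]]:
    path = format_path(iterator.magic)
    is_resuming, start_index = False, 0
    if os.path.exists(path):
        frozen = load(path)          # restore a previously saved state
        iterator.thaw(frozen)
        is_resuming, start_index = True, frozen["index"]
    try:
        yield is_resuming, start_index
    except KeyboardInterrupt:
        save(iterator.freeze(), path)  # keep the state for the next run
        raise
    else:
        if os.path.exists(path):
            os.unlink(path)            # finished: the resume file is obsolete


def load_json(path: str) -> Dict[str, int]:
    with open(path) as f:
        return json.load(f)


def save_json(frozen: Dict[str, int], path: str) -> None:
    with open(path, "w") as f:
        json.dump(frozen, f)


workdir = tempfile.mkdtemp()


def format_path(magic: str) -> str:
    return os.path.join(workdir, "resume_info_{}.json".format(magic))


processed = []

# First run: simulate an interruption after three items.
try:
    it = Counter(5)
    with resumable(it, load_json, save_json, format_path) as (is_resuming, start_index):
        for n in it:
            processed.append(n)
            if len(processed) == 3:
                raise KeyboardInterrupt
except KeyboardInterrupt:
    pass

# Second run: a fresh, equally-constructed iterator picks up the saved state.
it = Counter(5)
with resumable(it, load_json, save_json, format_path) as (is_resuming, start_index):
    for n in it:
        processed.append(n)
```

In this sketch the interruption is re-raised after saving, so the caller still sees the KeyboardInterrupt; the resume file exists only between an interrupted run and the run that completes the iteration.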