Resumable Iterations
For many download targets, Instaloader is able to resume a previously interrupted iteration. It provides an interruptible iterator, NodeIterator, and a context manager, resumable_iteration(), both of which are presented here.
Added in version 4.5.
NodeIterator
- class instaloader.NodeIterator(context: InstaloaderContext, query_hash: str | None, edge_extractor: Callable[[Dict[str, Any]], Dict[str, Any]], node_wrapper: Callable[[Dict], T], query_variables: Dict[str, Any] | None = None, query_referer: str | None = None, first_data: Dict[str, Any] | None = None, is_first: Callable[[T, T | None], bool] | None = None, doc_id: str | None = None)
Iterate the nodes within edges in a GraphQL pagination. Instances of this class are returned by many (but not all) of Instaloader’s Post-returning functions (such as Profile.get_posts()).
What makes this iterator special is its ability to freeze/store its current state, e.g. to interrupt an iteration, and later thaw/resume from where it left off.
You can freeze a NodeIterator with NodeIterator.freeze():

    post_iterator = profile.get_posts()
    try:
        for post in post_iterator:
            do_something_with(post)
    except KeyboardInterrupt:
        save("resume_information.json", post_iterator.freeze())

and later reuse it with NodeIterator.thaw() on an equally-constructed NodeIterator:

    post_iterator = profile.get_posts()
    post_iterator.thaw(load("resume_information.json"))

(Appropriate methods to load and save the FrozenNodeIterator are e.g. load_structure_from_file() and save_structure_to_file().)
A FrozenNodeIterator can only be thawed with a matching NodeIterator, i.e. a NodeIterator instance that has been constructed with the same parameters as the instance that is represented by the FrozenNodeIterator in question. This ensures that an iteration cannot be resumed in a wrong, mismatching loop. As a quick way to distinguish iterators that are saved e.g. in files, there is the NodeIterator.magic string: two NodeIterators match if and only if they have the same magic.
See also resumable_iteration() for a high-level context manager that handles a resumable iteration.
- property count: int | None
The count as returned by Instagram. This is not always the total count this iterator will yield.
- property first_item: T | None
If this iterator has produced any items, returns the first item produced.
It is possible to override what is considered the first item (for example, to consider the newest item in case items are not in strict chronological order) by passing a callback function as the is_first parameter when creating the class.
Added in version 4.8.
Changed in version 4.9.2: What is considered the first item can be overridden.
- freeze() → FrozenNodeIterator
Freeze the iterator for later resuming.
- property magic: str
Magic string for easily identifying a matching iterator file for resuming (hash of some parameters).
- thaw(frozen: FrozenNodeIterator) → None
Use this iterator to resume an earlier iteration.
- Raises:
If the iterator on which this method is called has already been used, or the given FrozenNodeIterator does not match, i.e. belongs to a different iteration.
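The freeze/thaw contract described above can be illustrated with a small toy model. This is a minimal sketch and not Instaloader's implementation: `ToyResumableIterator`, `FrozenState`, and the use of `ValueError` are hypothetical names and choices made for illustration, and the iterator walks a fixed list instead of paginating GraphQL results.

```python
import hashlib
import json
from typing import Iterator, List, NamedTuple


class FrozenState(NamedTuple):
    """Hypothetical frozen state: which iteration it belongs to and how far it got."""
    magic: str
    total_index: int


class ToyResumableIterator:
    """Toy model of a freezable iterator over a fixed list (not Instaloader code)."""

    def __init__(self, source_name: str, items: List[str]) -> None:
        self._source_name = source_name
        self._items = items
        self._index = 0

    @property
    def magic(self) -> str:
        # Short hash of the parameters that identify this iteration.
        encoded = json.dumps([self._source_name]).encode("utf-8")
        return hashlib.sha256(encoded).hexdigest()[:8]

    def __iter__(self) -> Iterator[str]:
        return self

    def __next__(self) -> str:
        if self._index >= len(self._items):
            raise StopIteration
        item = self._items[self._index]
        self._index += 1
        return item

    def freeze(self) -> FrozenState:
        # Capture just enough state to continue later.
        return FrozenState(magic=self.magic, total_index=self._index)

    def thaw(self, frozen: FrozenState) -> None:
        # Refuse to resume on a used iterator or with a mismatching state.
        if self._index != 0:
            raise ValueError("thaw() called on an already-used iterator")
        if frozen.magic != self.magic:
            raise ValueError("frozen state belongs to a different iteration")
        self._index = frozen.total_index


posts = ["p1", "p2", "p3", "p4"]

first_run = ToyResumableIterator("some_profile", posts)
seen = [next(first_run), next(first_run)]  # iteration interrupted after two items
state = first_run.freeze()

second_run = ToyResumableIterator("some_profile", posts)  # equally constructed
second_run.thaw(state)
seen.extend(second_run)  # continues with the third item
```

The two checks in `thaw()` mirror the two failure conditions listed above: resuming on an iterator that has already produced items, and resuming from a state whose magic does not match.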
- class instaloader.FrozenNodeIterator(query_hash, query_variables, query_referer, context_username, total_index, best_before, remaining_data, first_node, doc_id)
A serializable representation of a NodeIterator instance, saving its iteration state.
It can be serialized and deserialized with save_structure_to_file() and load_structure_from_file(), as well as with json and pickle thanks to being a NamedTuple.
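As a rough illustration of why a NamedTuple serializes easily with json, the following sketch round-trips a stand-in tuple through a JSON string. `FrozenState` is a hypothetical stand-in that reuses the field names from the signature above so the example runs without Instaloader; the field types and sample values are assumptions.

```python
import json
from typing import Any, Dict, NamedTuple, Optional


class FrozenState(NamedTuple):
    """Stand-in with the field names of FrozenNodeIterator (types are assumptions)."""
    query_hash: Optional[str]
    query_variables: Dict[str, Any]
    query_referer: Optional[str]
    context_username: Optional[str]
    total_index: int
    best_before: Optional[float]
    remaining_data: Optional[Dict[str, Any]]
    first_node: Optional[Dict[str, Any]]
    doc_id: Optional[str]


# Sample values are made up for the demonstration.
state = FrozenState(
    query_hash=None,
    query_variables={"id": "123"},
    query_referer=None,
    context_username=None,
    total_index=24,
    best_before=1700000000.0,
    remaining_data=None,
    first_node={"shortcode": "abc"},
    doc_id="0000",
)

# _asdict() keeps the field names, so the JSON document is a self-describing object.
encoded = json.dumps(state._asdict())

# Keyword expansion rebuilds the tuple from the decoded dictionary.
restored = FrozenState(**json.loads(encoded))
```

Passing the tuple to `json.dumps()` directly would also work, but it produces a JSON array without field names, so going through `_asdict()` makes the stored file easier to inspect and to load by keyword.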
resumable_iteration
- instaloader.resumable_iteration(context: InstaloaderContext, iterator: Iterable, load: Callable[[InstaloaderContext, str], Any], save: Callable[[FrozenNodeIterator, str], None], format_path: Callable[[str], str], check_bbd: bool = True, enabled: bool = True) → Iterator[Tuple[bool, int]]
High-level context manager to handle a resumable iteration that can be interrupted with a KeyboardInterrupt or an AbortDownloadException.
It can be used as follows to automatically load a previously-saved state into the iterator, save the iterator’s state when interrupted, and delete the resume file upon completion:

    post_iterator = profile.get_posts()
    with resumable_iteration(
            context=L.context,
            iterator=post_iterator,
            load=lambda _, path: FrozenNodeIterator(**json.load(open(path))),
            save=lambda fni, path: json.dump(fni._asdict(), open(path, 'w')),
            format_path=lambda magic: "resume_info_{}.json".format(magic)
    ) as (is_resuming, start_index):
        for post in post_iterator:
            do_something_with(post)

It yields a tuple (is_resuming, start_index).
When the passed iterator is not a NodeIterator, it behaves as if resumable_iteration was not used, just executing the inner body.
- Parameters:
context – The InstaloaderContext.
iterator – The fresh NodeIterator.
load – Loads a FrozenNodeIterator from given path. The object is ignored if it has a different type.
save – Saves the given FrozenNodeIterator to the given path.
format_path – Returns the path to the resume file for the given magic.
check_bbd – Whether to check the best before date and reject an expired FrozenNodeIterator.
enabled – Set to False to disable all functionality and simply execute the inner body.
Changed in version 4.7: Also interrupt on AbortDownloadException.
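The lambdas in the usage example open files without closing them explicitly. One possible alternative is to build the three callbacks as small functions that use `with` blocks. The sketch below is an assumption-laden illustration: `make_json_callbacks` and `DemoState` are hypothetical names, and in real use the `state_type` argument would presumably be `FrozenNodeIterator`. A tiny stand-in NamedTuple is used here so the example runs without Instaloader.

```python
import json
import os
import tempfile
from typing import Any, Callable, NamedTuple, Tuple, Type


def make_json_callbacks(
    state_type: Type[Any], directory: str
) -> Tuple[Callable[[Any, str], Any], Callable[[Any, str], None], Callable[[str], str]]:
    """Build load, save and format_path callbacks (hypothetical helper)."""

    def load(_context: Any, path: str) -> Any:
        # The context argument is unused, matching the lambda in the example above.
        with open(path, encoding="utf-8") as fp:
            return state_type(**json.load(fp))

    def save(frozen: Any, path: str) -> None:
        # The file is closed as soon as the dump has finished.
        with open(path, "w", encoding="utf-8") as fp:
            json.dump(frozen._asdict(), fp)

    def format_path(magic: str) -> str:
        # One resume file per magic string, kept in the chosen directory.
        return os.path.join(directory, "resume_info_{}.json".format(magic))

    return load, save, format_path


class DemoState(NamedTuple):
    """Small stand-in for a frozen iterator state, for demonstration only."""
    magic: str
    total_index: int


with tempfile.TemporaryDirectory() as tmp:
    load, save, format_path = make_json_callbacks(DemoState, tmp)
    path = format_path("abc123")
    save(DemoState(magic="abc123", total_index=42), path)
    reloaded = load(None, path)
```

The returned functions follow the callback signatures listed under Parameters, so they could be passed as `load=`, `save=`, and `format_path=` in place of the lambdas.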