Resumable Iterations
For many download targets, Instaloader is able to resume a
previously interrupted iteration. It provides an interruptible
iterator, NodeIterator, and a context manager, resumable_iteration(),
both of which are presented here.
New in version 4.5.
NodeIterator
-
class
NodeIterator
(context, query_hash, edge_extractor, node_wrapper, query_variables=None, query_referer=None, first_data=None)
Iterate the nodes within edges in a GraphQL pagination. Instances of this class are returned by many (but not all) of Instaloader’s Post-returning functions (such as Profile.get_posts() etc.).
What makes this iterator special is its ability to freeze/store its current state, e.g. to interrupt an iteration, and later thaw/resume from where it left off.
You can freeze a NodeIterator with NodeIterator.freeze():

    post_iterator = profile.get_posts()
    try:
        for post in post_iterator:
            do_something_with(post)
    except KeyboardInterrupt:
        save("resume_information.json", post_iterator.freeze())

and later reuse it with NodeIterator.thaw() on an equally-constructed NodeIterator:

    post_iterator = profile.get_posts()
    post_iterator.thaw(load("resume_information.json"))

(an appropriate way to load and save the FrozenNodeIterator is e.g. load_structure_from_file() and save_structure_to_file().)
A FrozenNodeIterator can only be thawed with a matching NodeIterator, i.e. a NodeIterator instance that has been constructed with the same parameters as the instance represented by the FrozenNodeIterator in question. This ensures that an iteration cannot be resumed in a wrong, mismatched loop. As a quick way to distinguish iterators that are saved e.g. in files, there is the NodeIterator.magic string: two NodeIterators match if and only if they have the same magic.
See also resumable_iteration() for a high-level context manager that handles a resumable iteration.
-
property
count
The count as returned by Instagram. This is not always the total count this iterator will yield.
-
property
magic
Magic string for easily identifying a matching iterator file for resuming (hash of some parameters).
- Return type
str
-
property
first_item
If this iterator has produced any items, returns the first item produced.
New in version 4.8.
- Return type
Optional[~T]
-
freeze
()
Freeze the iterator for later resuming.
- Return type
FrozenNodeIterator
-
thaw
(frozen)
Use this iterator to resume an earlier iteration.
- Raises
If the iterator on which this method is called has already been used, or the given FrozenNodeIterator does not match, i.e. belongs to a different iteration.
- Return type
None
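
The freeze/thaw contract described above can be illustrated with a toy stand-in that needs no network access. This is only a sketch: ToyIterator and FrozenCounter are hypothetical names, not Instaloader classes, the magic is computed with an arbitrary hash that may differ from the real scheme, and ValueError is used as a placeholder for whatever exception the library raises.

```python
import hashlib
from typing import Iterator, List, NamedTuple


class FrozenCounter(NamedTuple):
    """Toy stand-in for FrozenNodeIterator (hypothetical, not Instaloader's class)."""
    params: str       # the parameters the iterator was constructed with
    total_index: int  # number of items that have already been returned


class ToyIterator:
    """Toy stand-in for NodeIterator that walks over a fixed list of items."""

    def __init__(self, items: List[str], params: str) -> None:
        self._items = items
        self._params = params
        self._index = 0

    @property
    def magic(self) -> str:
        # Hash of the construction parameters; the real hashing scheme may differ.
        return hashlib.sha256(self._params.encode()).hexdigest()[:8]

    def __iter__(self) -> Iterator[str]:
        return self

    def __next__(self) -> str:
        if self._index >= len(self._items):
            raise StopIteration
        item = self._items[self._index]
        self._index += 1
        return item

    def freeze(self) -> FrozenCounter:
        # Capture just enough state to continue later.
        return FrozenCounter(params=self._params, total_index=self._index)

    def thaw(self, frozen: FrozenCounter) -> None:
        # Placeholder exception type; both conditions mirror the documented ones.
        if self._index != 0:
            raise ValueError("thaw() called on an already-used iterator")
        if frozen.params != self._params:
            raise ValueError("resume information belongs to a different iteration")
        self._index = frozen.total_index


items = ["a", "b", "c", "d"]
first = ToyIterator(items, params="profile=alice")
seen = [next(first), next(first)]   # consume two items, then "interrupt"
state = first.freeze()

resumed = ToyIterator(items, params="profile=alice")
resumed.thaw(state)
rest = list(resumed)                # continues with the remaining items
```

Thawing `state` into an iterator built with different parameters, or into one that has already yielded items, is rejected in this sketch, which is the property that keeps a resume file from being applied to the wrong loop.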
-
class
FrozenNodeIterator
(query_hash, query_variables, query_referer, context_username, total_index, best_before, remaining_data, first_node)
A serializable representation of a NodeIterator instance, saving its iteration state.
It can be serialized and deserialized with save_structure_to_file() and load_structure_from_file(), as well as with json and pickle thanks to being a namedtuple().
-
best_before
Date when parts of the stored nodes might have expired.
-
context_username
The username who created the iterator, or None.
-
first_node
Node data of the first item, if an item has been produced.
-
query_hash
The GraphQL query_hash parameter.
-
query_referer
The HTTP referer used for the GraphQL query.
-
query_variables
The GraphQL query_variables parameter.
-
remaining_data
The already-retrieved, yet-unprocessed edges and the page_info at time of freezing.
-
total_index
Number of items that have already been returned.
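
Because the frozen state is a namedtuple, a JSON round trip can go through `_asdict()` and keyword unpacking. The sketch below redefines a stand-in namedtuple with the field names listed above so that it runs without Instaloader installed; in real code the library's own FrozenNodeIterator would be used instead. All field values are made up, and treating best_before as a numeric timestamp is an assumption.

```python
import json
from collections import namedtuple

# Stand-in with the documented field names (assumption: the real class is
# used the same way; it is redefined here only to keep the example standalone).
FrozenNodeIterator = namedtuple(
    "FrozenNodeIterator",
    ["query_hash", "query_variables", "query_referer", "context_username",
     "total_index", "best_before", "remaining_data", "first_node"],
)

frozen = FrozenNodeIterator(
    query_hash="0123abcd",             # made-up value
    query_variables={"id": "42"},      # made-up value
    query_referer=None,
    context_username=None,
    total_index=24,
    best_before=1700000000.0,          # assumed to be a numeric timestamp
    remaining_data={"edges": [], "page_info": {"has_next_page": False}},
    first_node=None,
)

# _asdict() keeps the field names; dumping the tuple itself would yield a bare list.
text = json.dumps(frozen._asdict())
restored = FrozenNodeIterator(**json.loads(text))
```

Going through `_asdict()` rather than serializing the tuple directly keeps the stored file readable and lets the fields be restored by name.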
-
resumable_iteration
-
resumable_iteration
(context, iterator, load, save, format_path, check_bbd=True, enabled=True)
High-level context manager to handle a resumable iteration that can be interrupted with a KeyboardInterrupt or an AbortDownloadException.
It can be used as follows to automatically load a previously-saved state into the iterator, save the iterator’s state when interrupted, and delete the resume file upon completion:

    post_iterator = profile.get_posts()
    with resumable_iteration(
            context=L.context,
            iterator=post_iterator,
            load=lambda _, path: FrozenNodeIterator(**json.load(open(path))),
            save=lambda fni, path: json.dump(fni._asdict(), open(path, 'w')),
            format_path=lambda magic: "resume_info_{}.json".format(magic)
    ) as (is_resuming, start_index):
        for post in post_iterator:
            do_something_with(post)

It yields a tuple (is_resuming, start_index).
When the passed iterator is not a NodeIterator, it behaves as if resumable_iteration was not used, just executing the inner body.
- Parameters
context (InstaloaderContext) – The InstaloaderContext.
iterator (Iterable[+T_co]) – The fresh NodeIterator.
load (Callable[[InstaloaderContext, str], Any]) – Loads a FrozenNodeIterator from given path. The object is ignored if it has a different type.
save (Callable[[FrozenNodeIterator, str], None]) – Saves the given FrozenNodeIterator to the given path.
format_path (Callable[[str], str]) – Returns the path to the resume file for the given magic.
check_bbd (bool) – Whether to check the best before date and reject an expired FrozenNodeIterator.
enabled (bool) – Set to False to disable all functionality and simply execute the inner body.
Changed in version 4.7: Also interrupt on AbortDownloadException.
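
The load, save-on-interrupt, and delete-on-completion behaviour of such a context manager can be sketched with the standard library alone. The code below is a simplified illustration, not Instaloader's implementation: ToyIterator and toy_resumable_iteration are hypothetical names, the resume file stores only an index, the magic is a fixed string, and the best-before check and AbortDownloadException handling are left out.

```python
import json
import os
import tempfile
from contextlib import contextmanager


class ToyIterator:
    """Minimal resumable iterator over a list (hypothetical stand-in)."""

    def __init__(self, items):
        self.items = items
        self.index = 0
        self.magic = "toy"  # fixed identifier here; the real magic is a hash

    def __iter__(self):
        return self

    def __next__(self):
        if self.index >= len(self.items):
            raise StopIteration
        self.index += 1
        return self.items[self.index - 1]


@contextmanager
def toy_resumable_iteration(iterator, format_path):
    path = format_path(iterator.magic)
    is_resuming = os.path.exists(path)
    if is_resuming:
        # Load the previously saved state into the fresh iterator.
        with open(path) as f:
            iterator.index = json.load(f)["index"]
    start_index = iterator.index
    try:
        yield is_resuming, start_index
    except KeyboardInterrupt:
        # Save the state so that a later run can continue, then re-raise.
        with open(path, "w") as f:
            json.dump({"index": iterator.index}, f)
        raise
    else:
        # Iteration completed: the resume file is no longer needed.
        if os.path.exists(path):
            os.unlink(path)


workdir = tempfile.mkdtemp()


def format_path(magic):
    return os.path.join(workdir, "resume_info_{}.json".format(magic))


processed = []

# First run: simulate Ctrl+C after two items have been processed.
try:
    it = ToyIterator(["a", "b", "c", "d"])
    with toy_resumable_iteration(it, format_path):
        for item in it:
            processed.append(item)
            if item == "b":
                raise KeyboardInterrupt
except KeyboardInterrupt:
    pass

# Second run: a fresh iterator picks up where the first one stopped.
it2 = ToyIterator(["a", "b", "c", "d"])
with toy_resumable_iteration(it2, format_path) as (is_resuming, start_index):
    for item in it2:
        processed.append(item)
```

Unlike the lambdas in the usage example above, this sketch opens the resume file in `with` blocks so the file handle is closed deterministically; the same pattern could be used in named load and save helpers.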