Merged
138 commits
185ada5
Validator Proxy Response Update (#103)
dylanuys Nov 20, 2024
132dc62
Two new image models: SDXL finetuned on Midjourney, and SD finetuned …
aliang322 Nov 20, 2024
7f8a26a
Added required StableDiffusionPipeline import
aliang322 Nov 20, 2024
ec35faf
Merge pull request #105 from BitMind-AI/expand-models/image-finetuned
aliang322 Nov 21, 2024
c98c08f
Updated transformers version to fix tokenizer initialization error
benliang99 Nov 22, 2024
b3d0a57
Merge pull request #107 from BitMind-AI/transformers-version-update
aliang322 Nov 22, 2024
f2cefdd
GPU Specification (#108)
dylanuys Nov 24, 2024
3097718
Update __init__.py
dylanuys Nov 24, 2024
686fbf7
pulling in latest from main
dylanuys Nov 24, 2024
6060c65
removing logging
benliang99 Nov 24, 2024
5c00f9e
old logging removed
benliang99 Nov 25, 2024
fe9fa6d
adding check for state file in case it is deleted somehow
dylanuys Nov 25, 2024
343b1d6
Merge branch 'testnet' of github.com:BitMind-AI/bitmind-subnet into t…
dylanuys Nov 25, 2024
14a32c3
removing remaining random prompt generation code
benliang99 Nov 25, 2024
a4032dc
Merge branch 'main' into testnet
dylanuys Nov 27, 2024
3866df2
Merge branch 'main' into testnet
dylanuys Dec 2, 2024
4678efa
[Testnet] Video Challenges V1 (#111)
dylanuys Dec 3, 2024
d35af29
fixing image_size for augmentations
dylanuys Dec 3, 2024
3e23a4a
Updated validator gpu requirements (#113)
benliang99 Dec 3, 2024
912c8ab
splitting rewards over image and video (#112)
dylanuys Dec 3, 2024
d22a250
Update README.md (#110)
kenobijon Dec 3, 2024
92dafa4
combining requirements files
dylanuys Dec 3, 2024
fe84b1a
Merge branch 'testnet' of github.com:BitMind-AI/bitmind-subnet into t…
dylanuys Dec 3, 2024
d46694c
Combined requirements installation
benliang99 Dec 3, 2024
3e5c2e2
Improved formatting, added checks to prevent overwriting existing .en…
benliang99 Dec 3, 2024
485f095
Merge pull request #115 from BitMind-AI/combined-requirements
aliang322 Dec 3, 2024
f3edacc
Re-added endpoint options
benliang99 Dec 3, 2024
80a0773
Fixed incorrect diffusers install
benliang99 Dec 4, 2024
f907031
Fixed missing initialization of miner performance trackers
benliang99 Dec 4, 2024
e115a52
Merge pull request #117 from BitMind-AI/video-uat-fixes
aliang322 Dec 4, 2024
0feab36
[Testnet] Docs Updates (#114)
dylanuys Dec 4, 2024
d7c4a1f
Removed deprecated requirements files from github tests (#118)
benliang99 Dec 4, 2024
00a591d
[Testnet] Async Cache Updates (#119)
dylanuys Dec 4, 2024
ae2904f
Increased minimum and recommended storage (#120)
benliang99 Dec 4, 2024
235c483
[Testnet] Data download cleanup (#121)
dylanuys Dec 4, 2024
43f8b71
pep8
dylanuys Dec 4, 2024
9c7e6b3
use png codec, sample by framerate + num frames
dylanuys Dec 4, 2024
979bc86
fps, min_fps, max_fps parameterization of sample
dylanuys Dec 4, 2024
9d5fbb2
return fps and num frames
dylanuys Dec 4, 2024
3add4a8
Merge pull request #122 from BitMind-AI/variable-framerate-sampling
aliang322 Dec 4, 2024
7870207
Fix registry module imports (#123)
dylanuys Dec 5, 2024
571c4c6
Update README.md
dylanuys Dec 5, 2024
0b455a3
README title
dylanuys Dec 5, 2024
ac9f495
removing samples from cache
dylanuys Dec 5, 2024
391b3af
README
dylanuys Dec 5, 2024
e9f47ce
fixing cache removal (#125)
dylanuys Dec 5, 2024
c5c7cde
Fixed tensor not being set to device for video challenges, causing er…
aliang322 Dec 5, 2024
fd5b640
Mainnet Prep (#127)
dylanuys Dec 5, 2024
36b3721
removign old reqs from autoupdate
dylanuys Dec 5, 2024
b772db0
Re-added bitmind HF org prefix to dataset path
benliang99 Dec 5, 2024
dbf1114
Merge pull request #128 from BitMind-AI/open-images-path-fix
aliang322 Dec 5, 2024
93fb6aa
shortening self heal timer
dylanuys Dec 5, 2024
5f849c6
autoupdate
dylanuys Dec 5, 2024
94d2eeb
autoupdate
dylanuys Dec 5, 2024
ec3a8f7
sample size
dylanuys Dec 5, 2024
1a07c68
pulling in hotfix from main
dylanuys Dec 6, 2024
5acb1af
Merge branch 'main' into testnet
dylanuys Dec 10, 2024
2416dbd
Validator Improvements: VRAM usage, logging (#131)
dylanuys Dec 11, 2024
c2cb1c5
version bump
dylanuys Dec 11, 2024
7e527ea
moved info log setting to config.py
dylanuys Dec 11, 2024
649e29a
Merge branch 'main' into testnet
dylanuys Dec 17, 2024
e47c49d
Bittensor 8.5.1 (#133)
dylanuys Dec 18, 2024
ef8e129
Prompt Generation Pipeline Improvements (#135)
dylanuys Jan 5, 2025
3507bad
[testnet] I2i/in painting (#137)
dylanuys Jan 7, 2025
eb73189
Updated image_annotation_generator to prompt_generator (#138)
aliang322 Jan 8, 2025
93e7fc9
bump version 2.0.3 -> 2.1.0
dylanuys Jan 8, 2025
ab0aac8
testing cache clearing via autoupdate
dylanuys Jan 8, 2025
49261cf
Merge branch 'testnet' of github.com:BitMind-AI/bitmind-subnet into t…
dylanuys Jan 8, 2025
7fda1bc
cranking up video rewards to .2
dylanuys Jan 8, 2025
47cfe4e
Merge branch 'main' into testnet
dylanuys Jan 8, 2025
dd30c8a
Add DeepFloyd/IF model and multi-stage pipeline support
aliang322 Jan 10, 2025
941b3ec
Moved multistage pipeline generator to SyntheticDataGenerator
aliang322 Jan 10, 2025
927e932
Args for testing specific model
aliang322 Jan 10, 2025
15e5ffe
[TESTNET] HunyuanVideo (#140)
dylanuys Jan 23, 2025
19850af
merging main
dylanuys Jan 23, 2025
aa88c87
Update __init__.py
dylanuys Jan 23, 2025
304f8f4
updated subnet arch diagram
dylanuys Jan 25, 2025
d9cec1e
README wip
dylanuys Jan 25, 2025
dc2ba6a
docs udpates
dylanuys Jan 25, 2025
10c6c55
README updates
dylanuys Jan 25, 2025
25bb436
README updates
dylanuys Jan 25, 2025
f6d6e0e
more README udpates
dylanuys Jan 25, 2025
da32df2
README updates
dylanuys Jan 25, 2025
3f94567
README udpates
dylanuys Jan 25, 2025
b7a9427
README cleanup
dylanuys Jan 25, 2025
c59e36b
more README updates
dylanuys Jan 25, 2025
bc82324
Fixing table border removal html for github
dylanuys Jan 25, 2025
6815dba
fixing table html
dylanuys Jan 25, 2025
a136012
one last attempt at a prettier table
dylanuys Jan 25, 2025
2c683e8
one last last attempt at a prettier table
dylanuys Jan 25, 2025
7a5d94c
bumping video rewards
dylanuys Jan 25, 2025
79cf10d
removing decay for unsampled miners
dylanuys Jan 26, 2025
6a03333
README cleanup
dylanuys Jan 26, 2025
a6e0ee9
increasing suggested and min compute for validators
dylanuys Jan 27, 2025
655bc2b
README update, markdown fix in Incentive.md
dylanuys Jan 27, 2025
4cb3bd6
README tweak
dylanuys Jan 27, 2025
d84b4cc
removing redundant dereg check from update_scores
dylanuys Jan 27, 2025
4550958
Deepfloyed specific configs, args for better cache/data gen testing, …
aliang322 Jan 28, 2025
ae11857
Merge testnet into t2i/deepfloyed-if
aliang322 Jan 28, 2025
4f24e5b
use largest deepfloyed-if I and II models, ensure no watermarker
aliang322 Jan 30, 2025
b930fbf
Fixed FLUX resolution format, added back model_id and scheduler loadi…
aliang322 Jan 30, 2025
5f7d139
Add Janus-Pro-7B t2i model with custom diffuser pipeline class
aliang322 Jan 30, 2025
7dbc0c7
Janus repo install
aliang322 Jan 30, 2025
bbceeca
Removed custom wrapper files, added Janus DiffusionPipeline wrapper t…
aliang322 Jan 30, 2025
4ac1a4e
Removed DiffusionPipeline import
aliang322 Jan 30, 2025
655e1f5
Uncomment wandb inits
aliang322 Jan 30, 2025
098a4b1
Move create_pipeline_generator() to model utils
aliang322 Feb 3, 2025
4dd1c79
Moved model optimizations to model utils
aliang322 Feb 3, 2025
1e0bbc3
Merge pull request #147 from BitMind-AI/t2i/deepfloyed-if-and-janus-pro
aliang322 Feb 3, 2025
c7a23dd
[Testnet] Mutli-Video Challenges (#148)
dylanuys Feb 6, 2025
5a64f22
merging main
dylanuys Feb 6, 2025
cbce647
Update config.py
dylanuys Feb 6, 2025
a7561fa
explicit requirements install
dylanuys Feb 6, 2025
f172cbb
moving pm2 process stopping prior to model verification
dylanuys Feb 7, 2025
6eb5352
fix for no available vidoes in multi-video challenge generation
dylanuys Feb 7, 2025
1ea4f7c
Update forward.py
dylanuys Feb 7, 2025
06cbe62
Merge branch 'main' into testnet
dylanuys Feb 11, 2025
913ee2a
[Testnet] Multiclass Rewards (#150)
dylanuys Feb 18, 2025
094e353
[Testnet] video organics (#151)
dylanuys Feb 19, 2025
aa3fb64
Validator Proxy handling of Multiclass Responses (#153)
dylanuys Feb 20, 2025
d343485
new incentive docs (#154)
dylanuys Feb 20, 2025
807ff35
python-multipart
dylanuys Feb 20, 2025
bf6a629
Merge branch 'main' into testnet
dylanuys Feb 20, 2025
ee11a95
merging main
dylanuys Feb 21, 2025
78635fe
Merge branch 'main' into testnet
dylanuys Feb 26, 2025
c54fbea
Semisynthetic Cache (#158)
dylanuys Mar 10, 2025
1b15ba9
version bump
dylanuys Mar 10, 2025
fcc686e
Changing mutliclass reward weight to .25
dylanuys Mar 10, 2025
b4414c6
uncommenting dlib
dylanuys Mar 10, 2025
36596dc
bittensor==9.0.3
dylanuys Mar 10, 2025
8059507
Handling datasets with few files that don't need regular local updates
dylanuys Mar 10, 2025
d382d55
fixing error logging variable names
dylanuys Mar 10, 2025
0a34dc3
merging main
dylanuys Mar 13, 2025
96b65c4
dreamshaper-8-inpainting (#161)
dylanuys Mar 13, 2025
c7ebe7b
Vali/sd v15 inpainting (#162)
dylanuys Mar 13, 2025
705794b
refresh cache
dylanuys Mar 17, 2025
5b9b820
Merge branch 'main' into testnet
dylanuys Mar 19, 2025
f9f1dbf
[Testnet] Broken Pipes Fix (#166)
dylanuys Mar 19, 2025
1 change: 0 additions & 1 deletion autoupdate_validator_steps.sh
@@ -6,5 +6,4 @@

echo $CONDA_PREFIX
./setup_env.sh
-rm -rf ~/.cache/sn34/
echo "Autoupdate steps complete :)"
2 changes: 1 addition & 1 deletion bitmind/__init__.py
@@ -18,7 +18,7 @@
# DEALINGS IN THE SOFTWARE.


-__version__ = "2.2.3"
+__version__ = "2.2.4"
version_split = __version__.split(".")
__spec_version__ = (
(1000 * int(version_split[0]))
Expand Down
372 changes: 372 additions & 0 deletions bitmind/base/bm_dendrite.py
@@ -0,0 +1,372 @@
import asyncio
import time
import uuid
from typing import Any, AsyncGenerator, Optional, Union, Type, List

import aiohttp
from bittensor_wallet import Keypair, Wallet

from bittensor.core.axon import Axon
from bittensor.core.chain_data import AxonInfo
from bittensor.core.stream import StreamingSynapse
from bittensor.core.synapse import Synapse
from bittensor.utils.btlogging import logging
from bittensor.core.dendrite import Dendrite

class BMDendrite(Dendrite):
"""
Enhanced Dendrite implementation with improved connection pooling and resilience.

This class extends the standard Dendrite to provide better handling of concurrent
connections, automatic retries for common network issues, and batch processing
of multiple axon queries to prevent resource exhaustion.

Args:
wallet (Optional[Union["Wallet", "Keypair"]]): The wallet or keypair used for
signing messages. Same as parent Dendrite.
max_connections (int): Maximum number of total concurrent connections.
max_connections_per_axon (int): Maximum number of concurrent connections per host.
retry_attempts (int): Number of retry attempts for recoverable errors.
batch_size (int): Number of axons to query in a single batch when running async.
keepalive_timeout (float): How long to keep connections alive in the pool (seconds).
"""

def __init__(
self,
wallet: Optional[Union["Wallet", "Keypair"]] = None,
max_connections: int = 100,
max_connections_per_axon: int = 8,
retry_attempts: int = 2,
batch_size: int = 20,
keepalive_timeout: float = 15.0
):
super().__init__(wallet=wallet)

self.max_connections = max_connections
self.max_connections_per_axon = max_connections_per_axon
self.retry_attempts = retry_attempts
self.batch_size = batch_size
self.keepalive_timeout = keepalive_timeout

self._session = None

self._connection_metrics = {
"total_requests": 0,
"retried_requests": 0,
"failed_requests": 0,
"successful_requests": 0,
}

@property
async def session(self) -> aiohttp.ClientSession:
"""
An asynchronous property that provides access to the internal aiohttp client session
with improved connection pooling.

Returns:
aiohttp.ClientSession: The active aiohttp client session instance with custom connection pooling.
"""
if self._session is None:
connector = aiohttp.TCPConnector(
limit=self.max_connections,
limit_per_host=self.max_connections_per_axon,
force_close=False,
enable_cleanup_closed=True,
keepalive_timeout=self.keepalive_timeout
)

self._session = aiohttp.ClientSession(
connector=connector,
timeout=aiohttp.ClientTimeout(
total=None,
connect=5.0,
sock_connect=5.0,
sock_read=10.0
),
raise_for_status=False # handle HTTP status errors within the class
)
return self._session

async def forward(
self,
axons: Union[list[Union["AxonInfo", "Axon"]], Union["AxonInfo", "Axon"]],
synapse: "Synapse" = Synapse(),
timeout: float = 12,
deserialize: bool = True,
run_async: bool = True,
streaming: bool = False,
) -> list[Union["AsyncGenerator[Any, Any]", "Synapse", "StreamingSynapse"]]:
"""
Enhanced forward method with batch processing and improved error handling.

This implementation processes axons in batches when running asynchronously to prevent
overwhelming network resources and connection pools.

Args:
axons: Target axons to query (single axon or list of axons)
synapse: The Synapse object to send
timeout: Maximum time to wait for a response
deserialize: Whether to deserialize the response
run_async: Whether to run queries concurrently
streaming: Whether the response is expected as a stream

Returns:
Response from axons (single response or list of responses)
"""
is_list = True
if not isinstance(axons, list):
is_list = False
axons = [axons]

is_streaming_subclass = issubclass(synapse.__class__, StreamingSynapse)
if streaming != is_streaming_subclass:
logging.warning(
f"Argument streaming is {streaming} but synapse {synapse.__class__.__name__} "
f"{'is' if is_streaming_subclass else 'is not'} a StreamingSynapse subclass. "
"This may cause unexpected behavior."
)
streaming = is_streaming_subclass or streaming

async def query_all_axons(
is_stream: bool,
) -> Union["AsyncGenerator[Any, Any]", "Synapse", "StreamingSynapse"]:
"""Query all axons with improved connection handling."""

async def single_axon_response_with_retry(
target_axon: Union["AxonInfo", "Axon"],
retries: int = 0
) -> Union["AsyncGenerator[Any, Any]", "Synapse", "StreamingSynapse"]:
"""Process a single axon with retry logic for connection errors."""
self._connection_metrics["total_requests"] += 1
try:
if is_stream:
# If in streaming mode, return the async_generator
result = self.call_stream(
target_axon=target_axon,
synapse=synapse.model_copy(), # type: ignore
timeout=timeout,
deserialize=deserialize,
)
self._connection_metrics["successful_requests"] += 1
return result
else:
# If not in streaming mode, simply call the axon and get the response.
result = await self.call(
target_axon=target_axon,
synapse=synapse.model_copy(), # type: ignore
timeout=timeout,
deserialize=deserialize,
)
self._connection_metrics["successful_requests"] += 1
return result
except (aiohttp.ClientOSError, ConnectionResetError, aiohttp.ServerDisconnectedError) as e:
# Retry on common network/connection errors
error_str = str(e)
is_retryable = (
"Broken pipe" in error_str or
"Connection reset" in error_str or
"Server disconnected" in error_str
)

if retries < self.retry_attempts and is_retryable:
backoff_time = 0.1 * (2 ** retries)
logging.debug(
f"Connection error to {target_axon.ip}:{target_axon.port}, "
f"retrying in {backoff_time:.2f}s ({retries+1}/{self.retry_attempts})"
)
self._connection_metrics["retried_requests"] += 1
await asyncio.sleep(backoff_time)
return await single_axon_response_with_retry(target_axon, retries + 1)

self._connection_metrics["failed_requests"] += 1
raise

if not run_async:
return [
await single_axon_response_with_retry(target_axon) for target_axon in axons
]

all_responses = []
for i in range(0, len(axons), self.batch_size):
batch = axons[i:i+self.batch_size]
batch_responses = await asyncio.gather(
*(single_axon_response_with_retry(target_axon) for target_axon in batch),
return_exceptions=True # Don't let one failure block others
)

# Process any exceptions that were captured
for j, response in enumerate(batch_responses):
if isinstance(response, Exception):
failed_synapse = synapse.model_copy()
target_axon = batch[j]
failed_synapse = self.preprocess_synapse_for_request(
target_axon, failed_synapse, timeout
)
failed_synapse = self.process_error_message(
failed_synapse,
failed_synapse.__class__.__name__,
response
)
batch_responses[j] = failed_synapse

all_responses.extend(batch_responses)

return all_responses

responses = await query_all_axons(streaming)
return responses[0] if len(responses) == 1 and not is_list else responses

async def call(
self,
target_axon: Union["AxonInfo", "Axon"],
synapse: "Synapse" = Synapse(),
timeout: float = 12.0,
deserialize: bool = True,
) -> "Synapse":
"""
Enhanced call method with improved error handling for connection issues.

Args:
target_axon: The target axon to query
synapse: The Synapse object to send
timeout: Maximum time to wait for a response
deserialize: Whether to deserialize the response

Returns:
The response Synapse object
"""

start_time = time.time()
target_axon = (
target_axon.info() if isinstance(target_axon, Axon) else target_axon
)

request_name = synapse.__class__.__name__
url = self._get_endpoint_url(target_axon, request_name=request_name)

synapse = self.preprocess_synapse_for_request(target_axon, synapse, timeout)

try:
self._log_outgoing_request(synapse)

try:
async with (await self.session).post(
url=url,
headers=synapse.to_headers(),
json=synapse.model_dump(),
timeout=aiohttp.ClientTimeout(total=timeout),
) as response:
json_response = await response.json()
self.process_server_response(response, json_response, synapse)
except aiohttp.ClientPayloadError as e:
if "Response payload is not completed" in str(e):
synapse.dendrite.status_code = "499"
synapse.dendrite.status_message = f"Incomplete response payload: {str(e)}"
else:
raise
except aiohttp.ClientOSError as e:
if "Broken pipe" in str(e):
synapse.dendrite.status_code = "503"
synapse.dendrite.status_message = f"Connection broken: {str(e)}"
else:
raise

synapse.dendrite.process_time = str(time.time() - start_time)

except Exception as e:
synapse = self.process_error_message(synapse, request_name, e)

finally:
self._log_incoming_response(synapse)
self.synapse_history.append(Synapse.from_headers(synapse.to_headers()))
return synapse.deserialize() if deserialize else synapse

async def call_stream(
self,
target_axon: Union["AxonInfo", "Axon"],
synapse: "StreamingSynapse" = Synapse(),
timeout: float = 12.0,
deserialize: bool = True,
) -> "AsyncGenerator[Any, Any]":
"""
Enhanced call_stream method for streaming responses with improved error handling.

Args:
target_axon: The target axon to query
synapse: The Synapse object to send
timeout: Maximum time to wait for initial response
deserialize: Whether to deserialize the response

Yields:
Response chunks from the streaming endpoint
"""
start_time = time.time()
target_axon = (
target_axon.info() if isinstance(target_axon, Axon) else target_axon
)

request_name = synapse.__class__.__name__
endpoint = (
f"0.0.0.0:{str(target_axon.port)}"
if target_axon.ip == str(self.external_ip)
else f"{target_axon.ip}:{str(target_axon.port)}"
)
url = f"http://{endpoint}/{request_name}"

synapse = self.preprocess_synapse_for_request(target_axon, synapse, timeout)

try:
self._log_outgoing_request(synapse)
stream_timeout = aiohttp.ClientTimeout(
total=None,
connect=10.0,
sock_connect=10.0,
sock_read=timeout
)

async with (await self.session).post(
url,
headers=synapse.to_headers(),
json=synapse.model_dump(),
timeout=stream_timeout,
) as response:
try:
async for chunk in synapse.process_streaming_response(response):
yield chunk
except (aiohttp.ClientPayloadError, aiohttp.ClientOSError) as e:
error_msg = str(e)
if "Broken pipe" in error_msg or "incomplete" in error_msg.lower():
logging.warning(f"Streaming interrupted: {error_msg}")
# The stream was interrupted, but we might have received partial data, so continue

json_response = synapse.extract_response_json(response)
self.process_server_response(response, json_response, synapse)

synapse.dendrite.process_time = str(time.time() - start_time)

except Exception as e:
synapse = self.process_error_message(synapse, request_name, e)

finally:
self._log_incoming_response(synapse)
self.synapse_history.append(Synapse.from_headers(synapse.to_headers()))
if deserialize:
yield synapse.deserialize()
else:
yield synapse

def get_connection_metrics(self) -> dict:
"""
Get metrics about connection usage and errors.

Returns:
dict: A dictionary containing connection metrics
"""
return self._connection_metrics.copy()

def reset_connection_metrics(self) -> None:
"""Reset all connection metrics counters"""
self._connection_metrics = {
"total_requests": 0,
"retried_requests": 0,
"failed_requests": 0,
"successful_requests": 0,
}
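The core pattern of `BMDendrite.forward` — query axons in fixed-size batches, and retry individual transient connection failures with exponential backoff — can be sketched in isolation. This is a minimal, self-contained illustration, not code from the PR; `unreliable_query` is a hypothetical stand-in for the network call, and the constants (batch size 20, base backoff 0.1s, 2 retries) mirror the defaults above:

```python
import asyncio

_failed_once = set()

async def unreliable_query(target):
    """Demo stand-in for a network call: fails once per target, then succeeds."""
    if target not in _failed_once:
        _failed_once.add(target)
        raise ConnectionResetError("Connection reset")
    return f"ok:{target}"

async def query_with_retry(target, attempt=0, max_retries=2):
    """Query one target, retrying transient connection errors with backoff."""
    try:
        return await unreliable_query(target)
    except ConnectionResetError:
        if attempt < max_retries:
            # Exponential backoff: 0.1s, 0.2s, 0.4s, ...
            await asyncio.sleep(0.1 * (2 ** attempt))
            return await query_with_retry(target, attempt + 1, max_retries)
        raise

async def query_all(targets, batch_size=20):
    """Process targets in batches so at most batch_size requests are in flight."""
    responses = []
    for i in range(0, len(targets), batch_size):
        batch = targets[i:i + batch_size]
        responses.extend(await asyncio.gather(
            *(query_with_retry(t) for t in batch),
            return_exceptions=True,  # one failure does not cancel the batch
        ))
    return responses
```

`return_exceptions=True` is what lets the PR's `forward` convert per-axon failures into error synapses instead of letting one bad miner abort the whole batch; `asyncio.gather` also preserves input order, so responses line up with the axon list.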