Merged
245 commits
b9ac387
[Autoscaler] Display node status tag in autsocaler status (#13561)
Jan 21, 2021
daf0bef
[RLlib] Dreamer: Fix broken import and add compilation test case. (#1…
sven1977 Jan 21, 2021
d11e62f
[RLlib] Fix problem in preprocessing nested MultiDiscrete (#13308)
saeid93 Jan 21, 2021
587f207
[RLlib] Support for D4RL + Semi-working CQL Benchmark (#13550)
michaelzhiluo Jan 21, 2021
92f1e09
[Java] Fix return of java doc (#13601)
kfstorm Jan 21, 2021
a82fa80
Inline small objects in GetObjectStatus response. (#13309)
clarkzinzow Jan 21, 2021
6803874
[serve] Refactor BackendState to use ReplicaState classes (#13406)
ijrsvt Jan 21, 2021
87ca102
[Kubernetes] Unit test for cluster launch and teardown using K8s Oper…
DmitriGekhtman Jan 21, 2021
20acc3b
Revert "Inline small objects in GetObjectStatus response. (#13309)" (…
amogkam Jan 22, 2021
ccc901f
add 3.8 (#13608)
amogkam Jan 22, 2021
0998d69
[core] Admission control for pulling objects to the local node (#13514)
stephanie-wang Jan 22, 2021
4e01a9e
[Autoscaler] Ensure ubuntu is owner of docker host mount folder (#13579)
nikitavemuri Jan 22, 2021
1fbb752
[autoscaler] remove worker_default_node_type that is useless. (#13588)
Jan 22, 2021
4ecd29e
[dashboard] Fixes dashboard issues when environments have set http_pr…
ConeyLiu Jan 22, 2021
aa5d7a5
[Dashboard]Don't set node actors when node_id of actor is Nil (#13573)
WangTaoTheTonic Jan 22, 2021
39755fd
Revert "[Serve] Refactor BackendState" (#13626)
amogkam Jan 22, 2021
00c14ce
[Object Spilling] Skip flaky tests (#13628)
amogkam Jan 22, 2021
90f1e40
[Java] Add `fetchLocal` parameter in `Ray.wait()` (#13604)
kfstorm Jan 22, 2021
da59283
[Metrics] Cache metrics ports in a file at each node (#13501)
architkulkarni Jan 22, 2021
d629292
[RLlib] Add grad_clip config option to MARWIL and stabilize grad clip…
sven1977 Jan 22, 2021
7fec19d
[kubernetes][operator][minutiae] Backwards compatibility of operator …
DmitriGekhtman Jan 22, 2021
c4a7103
Revert "[dashboard] Fix RAY_RAYLET_PID KeyError on Windows (#12948)" …
amogkam Jan 22, 2021
0c3d9a3
[Metrics] Fix serialization for custom metrics (#13571)
architkulkarni Jan 22, 2021
25e1b78
[Dependencies] Move requirements.txt to requirements directory. (#13636)
amogkam Jan 23, 2021
01d74af
[horovod] Horovod+Ray Pytorch Lightning Accelerator (#13458)
amogkam Jan 23, 2021
8ef835f
Remove idle actor from worker pool. (#13523)
jovany-wang Jan 23, 2021
17760e1
[tune] update Optuna integration to 2.4.0 API (#13631)
krfricke Jan 23, 2021
b7dd7dd
deprecate useless fields in the cluster yaml. (#13637)
Jan 23, 2021
e675e5b
[ray_client]: Add more retry logic (#13478)
barakmich Jan 24, 2021
edbb293
[Object Spilling] Multi node file spilling V2. (#13542)
rkooo567 Jan 24, 2021
4dabf01
Close #12031 (Autoscaler is overriding your resource for same quantit…
Jan 25, 2021
e9103ee
[Java] [Test] Move multi-worker config to ray.conf file (#13583)
kfstorm Jan 25, 2021
9423930
[RLlib] MAML: Add cartpole mass test for PyTorch. (#13679)
sven1977 Jan 25, 2021
964689b
[RLlib] Fix bug in ModelCatalog when using custom action distribution…
janblumenkamp Jan 25, 2021
b4702de
[RLlib] move evaluation to trainer.step() such that the result is pro…
Maltimore Jan 25, 2021
db2c836
[Placement Group] Move PlacementGroup public method to interface. (#1…
clay4megtr Jan 25, 2021
f9f2bfa
[Metric] Fix crashed when register metric view in multithread (#13485)
ashione Jan 25, 2021
7920911
[kubernetes][operator][hotfix] Dictionary fix (#13663)
DmitriGekhtman Jan 25, 2021
1c77cc7
[docs] Remove API warning from mp.Pool (#13683)
edoakes Jan 25, 2021
d96a9fa
Revert "Revert "[dashboard] Fix RAY_RAYLET_PID KeyError on Windows (#…
amogkam Jan 25, 2021
9feae90
skip test_spill (#13693)
amogkam Jan 25, 2021
0d75f37
[tune](deps): Bump distributed in /python/requirements (#13643)
dependabot[bot] Jan 25, 2021
8b8d6b9
[Buildkite] Add all Python tests (#13566)
simon-mo Jan 26, 2021
fe8262a
Add K8s test to release process (#13694)
simon-mo Jan 26, 2021
f2867b0
[CI] Remove object_manager_test (#13703)
simon-mo Jan 26, 2021
840987c
Scalability Envelope Tests (#13464)
Jan 26, 2021
7a78f4e
[Collective][PR 4/6] NCCL Communicator caching and preliminary stream…
zhisbug Jan 26, 2021
ef1f7e4
[tune](deps): Bump smart-open[s3] in /python/requirements (#13699)
dependabot[bot] Jan 26, 2021
148b102
[tune](deps): Bump autogluon-core in /python/requirements (#13698)
dependabot[bot] Jan 26, 2021
5d882b0
[Serve] fix k8s doc (#13713)
edoakes Jan 26, 2021
4aff86b
[CI] skip failing java tests (#13702)
amogkam Jan 26, 2021
ddcbd22
Rename the ray.operator module to ray.ray_operator (#13705)
DmitriGekhtman Jan 26, 2021
5d82654
[CLI] Fix Ray Status with ENV Variable set (#13707)
ijrsvt Jan 26, 2021
0c46d09
[ray_client]: Monitor client stream errors (#13386)
barakmich Jan 26, 2021
6b477dd
[CI] Split test_multi_node to avoid timeouts (#13712)
amogkam Jan 26, 2021
f490e2b
[ray_client] Fix and extend get_actor test to detached actors (#13016)
barakmich Jan 26, 2021
ab6a634
[Serve] Revert "Revert "[Serve] Refactor BackendState" (#13626) (#13697)
ijrsvt Jan 26, 2021
2f48219
Revert "[CLI] Fix Ray Status with ENV Variable set (#13707)" (#13719)
simon-mo Jan 26, 2021
4f4e1b6
Fix multiprocessing starmap to allow passing in zip (#13664)
randxie Jan 26, 2021
4db0a31
[Core] Better error if /dev/shm is too small (#13624)
ijrsvt Jan 26, 2021
9cf0c49
[CI] Skip test_multi_node_3 on Windows (#13723)
simon-mo Jan 27, 2021
8baafac
[Logging] Log rotation config (#13375)
rkooo567 Jan 27, 2021
d2963f4
[Object Spilling] Clean up FS storage upon sigint for ray.init(). (#1…
rkooo567 Jan 27, 2021
7f6d326
[Placement Group]Add detached support for placement group. (#13582)
clay4megtr Jan 27, 2021
2664a2a
[tune] fix non-deterministic category sampling by switching back to `…
krfricke Jan 27, 2021
c5b645e
[tune] add type hints to tune.run(), fix abstract methods of Progress…
krfricke Jan 27, 2021
2d34e95
Don't gather check_parent_task on Windows, since it's undefined. (#13…
clarkzinzow Jan 27, 2021
06fac78
[serve] Fix whacky worker replica failure test (#13696)
edoakes Jan 27, 2021
202fbdf
[Serve] Fix ServeHandle serialization (#13695)
architkulkarni Jan 27, 2021
eba698d
Remove docs for install-nightly (#13744)
ericl Jan 27, 2021
b4bcb9b
[Docker] Use Cuda 11 (#13691)
ijrsvt Jan 27, 2021
c5209e2
[Docker] default to /home/ray (#13738)
ijrsvt Jan 27, 2021
56a9523
Fix high CPU usage in object manager due to O(n^2) iteration over act…
ericl Jan 27, 2021
3644df4
[CI] Add retry to java doc test (#13743)
simon-mo Jan 27, 2021
c0fe816
[Core/Autoscaler] Properly clean up resource backlog from (#13727)
Jan 27, 2021
bdf0c00
Revert "Revert "[CLI] Fix Ray Status with ENV Variable set (#13707) (…
ijrsvt Jan 27, 2021
32ec0d2
[Object Spilling] Remove job id from the io worker log name. (#13746)
rkooo567 Jan 28, 2021
25fa391
[Core] Add private on_completed callback for ObjectRef (#13688)
simon-mo Jan 28, 2021
28cf5f9
[docs] change MLFlow to MLflow in docs (#13739)
architkulkarni Jan 28, 2021
40234ad
[autoscaler][AWS] Make sure subnets belong to same VPC as user-specif…
DmitriGekhtman Jan 28, 2021
0e7343e
[docs] Fix MLflow / Tune example in documentation (#13740)
zhe-thoughts Jan 28, 2021
2e01d5d
Report failed deserialization of errors in Ray client
ericl Jan 28, 2021
c10abbb
Revert "[Serve] Fix ServeHandle serialization (#13695)" (#13753)
simon-mo Jan 28, 2021
4f1f558
[Core] Hotfix Windows Compilation Error for ClusterTaskManager (#13754)
simon-mo Jan 28, 2021
cb95ff1
[Serve] Add "endpoint registered" message to router log (#13752)
architkulkarni Jan 28, 2021
56ee6ef
[GCS]only update states related fields when publish actor table data …
WangTaoTheTonic Jan 28, 2021
d4ef5c5
[RLlib] Atari-RAM-Preprocessing, unsigned observation vector results …
cathrinS Jan 28, 2021
b01b0f8
[RLlib] Fix multiple Unity3DEnvs trying to connect to the same custom…
yurirocha15 Jan 28, 2021
c583113
[Ax] Align optimization mode and reported SEM with Ax (#13611)
lena-kashtelyan Jan 28, 2021
4bc257f
[RLlib] Fix custom multi action distr (#13681)
sven1977 Jan 28, 2021
cb771f2
[Serve] Add ServeHandle metrics (#13640)
architkulkarni Jan 28, 2021
0c906a8
[Docker] usage of python-version (#13011)
TanjaBayer Jan 28, 2021
813a7ab
[docker] Build Python3.6 & Python3.8 Docker Images (#13548)
ijrsvt Jan 28, 2021
42d501d
[core] Pin arguments during task execution (#13737)
stephanie-wang Jan 29, 2021
752da83
[Dashboard] Add the new dashboard code and prompt users to try it (#1…
mxz96102 Jan 29, 2021
0f3a3e1
Only delete local object in CoreWorkerPlasmaStoreProvider:::WarmupSto…
raulchen Jan 29, 2021
9a41314
[tune] dynamic global checkpointing interval (#13736)
krfricke Jan 29, 2021
4d6817c
[autoscaler] Better validation for min_workers and max_workers (#13779)
Jan 29, 2021
b20a38f
[autoscaler] Avoid launching GPU nodes when the workload only has CPU…
ericl Jan 29, 2021
0b598c0
[Serialization] API for deregistering serializers; code & doc cleanup…
suquark Jan 29, 2021
1a9a002
[Wheel] Build Py36 & Py38 in separate deploy (#13797)
ijrsvt Jan 29, 2021
c21a79a
[Object Spilling] 100GB shuffle release test (#13729)
rkooo567 Jan 29, 2021
9441f85
[client] Hook runtime context (#13750)
barakmich Jan 29, 2021
5080802
Revert "[autoscaler] Better validation for min_workers and max_worker…
simon-mo Jan 29, 2021
1946567
[CI] Deflake test_basics and skip test_component_failures_3 (#13801)
simon-mo Jan 29, 2021
a3796b3
[CI] Add other Travis Linux builds to buildkite (#13769)
simon-mo Jan 29, 2021
30f8232
[core] Add debug information for the PullManager and LocalObjectManag…
stephanie-wang Jan 30, 2021
4b60c38
[Dashboard] fix new dashboard entrance and some table problem (#13790)
mxz96102 Jan 30, 2021
660857f
Fix windows test (#13811)
Jan 30, 2021
b5f0aed
[Log] use default stderr logger if no raylog starting (#13762)
ashione Feb 1, 2021
2ba77ae
[Release] Fix SGD+Tune long running distributed release test (#13812)
amogkam Feb 1, 2021
d1ec787
[Object Spilling] Turn on by default. (#13745)
rkooo567 Feb 1, 2021
9d7b8b5
[autoscaler] Remove min workers from multi node type examples (#13814)
Feb 1, 2021
1d2ab01
Use right reserve size (#13829)
WangTaoTheTonic Feb 1, 2021
361e5f0
support dynamic library loading in C++ worker (#13734)
SongGuyang Feb 1, 2021
6e53a71
bug fix for doc (#13834)
SongGuyang Feb 1, 2021
754bee9
[core][object spillin] Fix bugs in admission control (#13781)
stephanie-wang Feb 1, 2021
55566bc
[ray_client]: Add python version check and test (and some minor fixes…
barakmich Feb 1, 2021
1ee5d5f
[AWS] Fill-in AMI if not provided (#13808)
ijrsvt Feb 1, 2021
26ba95e
[python/ray]: add cloudpickle dependency (#13838)
barakmich Feb 1, 2021
e4d3043
Fix naming of ray_spilled_objects directory
ericl Feb 1, 2021
886217c
[Object Spilling] Skip normal ray.get path when spilling objects. (#…
rkooo567 Feb 2, 2021
d71eeac
remove lru evict docs (#13849)
ericl Feb 2, 2021
88ab887
Unconditionally retry all RPC errors on client connect (#13845)
ericl Feb 2, 2021
26beb3b
Revert "Revert "Enable Ray client server by default (#13350)" (#13429…
ericl Feb 2, 2021
fa42900
Add Ray client protocol version (#13846)
ericl Feb 2, 2021
52c94b7
[RLlib] Allow SAC to use custom models as Q- or policy nets and depre…
sven1977 Feb 2, 2021
0c93bb7
[RLlib] Update Documentation for Curiosity's support of continuous ac…
QuantumMecha Feb 2, 2021
714c367
[RLlib] Trainer._validate_config idempotentcy correction (issue 13427…
raoul-khour-ts Feb 2, 2021
b9c15a2
[RLlib] Issue #13761: Fix get action shape (#13764)
stanislav-chekmenev Feb 2, 2021
d29fcfb
[tune] catch SIGINT signal and trigger experiment checkpoint (#13767)
krfricke Feb 2, 2021
a6138ca
[serve] Support batches for ImportedBackends (#13843)
edoakes Feb 2, 2021
0a0d918
[RLlib] Trajectory view API example script (enhancements and tf2 supp…
sven1977 Feb 2, 2021
9ac7315
[RLlib] Unify fcnet initializers for the value output layer (std=1.0 …
sven1977 Feb 2, 2021
863c1b8
Add podman support (#13633)
jCrompton Feb 2, 2021
fc956e0
[Hotfix] Lint (#13864)
edoakes Feb 2, 2021
32fc649
[serve] Add example code for custom status code response (#13868)
architkulkarni Feb 2, 2021
c8e1f07
remove starlette install instruction (#13869)
architkulkarni Feb 2, 2021
b4684cf
Fix bug that otal_commands_queued_ is not initialized (#13852)
ffbin Feb 3, 2021
d335ce2
Move the tune driver into a remote task (#13778)
ericl Feb 3, 2021
2a903b9
[joblib] Log once the context warning argument. (#13865)
Feb 3, 2021
a695c65
[serve] Small cleanups for BackendState (#13870)
edoakes Feb 3, 2021
875ea3f
[docs] Update actors.rst (#13873)
harryge00 Feb 3, 2021
7931045
Enabling the cancellation of non-actor tasks in a worker's queue 2 (#…
Feb 3, 2021
f14171c
[Core] Put raylet ip's in resource usage report (#13871)
Feb 3, 2021
77ee2c5
[ray_client] convert things registered for ray into ray_client (#13639)
barakmich Feb 3, 2021
cb9fa90
[Object Spilling] Add consumed bytes to detect thrashing. (#13853)
rkooo567 Feb 3, 2021
407302f
[Core] Ownership-based Object Directory - Changed infinite short-poll…
clarkzinzow Feb 3, 2021
e8fce9f
Check Ray client protocol version (#13886)
ericl Feb 4, 2021
1187d1d
[autoscaler][kubernetes][operator] Rudimentary error handling, make "…
DmitriGekhtman Feb 4, 2021
e0d9c8f
Always replace DEL with UNLINK (#13832)
WangTaoTheTonic Feb 4, 2021
44aa9c1
Rename timeout to period with heartbeat interval (#13872)
WangTaoTheTonic Feb 4, 2021
a13208f
Scalability envelope readme typo (#13874)
Feb 4, 2021
243f678
Fall back to random port instead of default port for non-primary Redi…
clarkzinzow Feb 4, 2021
e79a380
Check in shuffle code as experimental (#13899)
ericl Feb 4, 2021
0fc81e2
[tune] fix gpu check (#13825)
richardliaw Feb 4, 2021
6c77aeb
[docs] ray slack remove banners (#13898)
richardliaw Feb 4, 2021
1e113d2
[tune/xgboost] Update release test docs (#13880)
krfricke Feb 4, 2021
db59736
[autoscaler][kubernetes] Add ability to not copy cluster config to he…
DmitriGekhtman Feb 4, 2021
7af0c99
[serve] Built-in support for imported backends (#13867)
edoakes Feb 4, 2021
e89bbcb
[Serve] Revert "Revert "[Serve] Fix ServeHandle serialization"" and d…
architkulkarni Feb 4, 2021
982c606
Add more user-friendly error message upon `async def` remote task (#1…
kathryn-zhou Feb 5, 2021
40bad86
[hotfix][test][windows] Exclude k8s operator mock test from build. (#…
DmitriGekhtman Feb 5, 2021
fb89f9c
[Placement Group] Support named placement group (#13755)
clay4megtr Feb 5, 2021
8a5999c
[GCS]Fix bug that gcs client does not set last_resource_usage_ (#13856)
ffbin Feb 5, 2021
eee624c
Revert "Fix passing env on windows (#13253)" (#13828)
fyrestone Feb 5, 2021
f782ed5
Ray client version check strict eq (#13926)
ericl Feb 5, 2021
f44f368
[Tune] Add try-except to FailureInjectorCallback (#13939)
amogkam Feb 5, 2021
4a3dd68
Buildkite determine-to-run support (#13866)
simon-mo Feb 5, 2021
e1a5e5b
Fix test_actor_restart (#13901)
raulchen Feb 5, 2021
cbd3598
[tune] Fixed wait_for_gpu to handle str representations of ordinal ID…
tgaddair Feb 5, 2021
ea4154d
[Hotfix] Master compilation error on MacOS. (#13946)
simon-mo Feb 6, 2021
f070b3c
[dask-on-ray] Fix Dask-on-Ray test: Python 3 dictionary .values() is …
clarkzinzow Feb 6, 2021
1412f3c
[docs] page for using Modin with Ray (#13937)
devin-petersohn Feb 6, 2021
4b49414
[Java] fix actor restart failure when multi-worker is turned on (#13793)
kfstorm Feb 7, 2021
3a230fa
[ray_client] close ray connection upon client deactivation (#13919)
richardliaw Feb 7, 2021
7231b6b
[core/client] enable more tests (#13961)
richardliaw Feb 8, 2021
918ad84
[core] Java worker should respect the user provided node_ip_address (…
ConeyLiu Feb 8, 2021
bcf9457
[Java] fix test hang occasionally when running FailureTest (#13934)
kfstorm Feb 8, 2021
d001af3
[RLlib] Allow `rllib rollout` to run distributed via evaluation worke…
sven1977 Feb 8, 2021
ebeee1d
[RLlib] Pytorch MAML fix for more than two workers with discrete acti…
ChaceAshcraft Feb 8, 2021
eb00386
[RLlib] Extend on_learn_on_batch callback to allow for custom metrics…
sven1977 Feb 8, 2021
0e07b5f
[Doc] Update actor resource information (#13909)
rkooo567 Feb 8, 2021
ec94214
Revert "[Java] fix test hang occasionally when running FailureTest (#…
simon-mo Feb 8, 2021
09242e6
random a job id in c++ worker (#13982)
SongGuyang Feb 8, 2021
1643bc5
Fix autoscaler wrong parameter names (#13966)
Feb 8, 2021
081f3e5
[autoscaler][kubernetes] Ray client setup, example config simplificat…
DmitriGekhtman Feb 9, 2021
914696a
Skip placement tests on Windows (#14000)
simon-mo Feb 9, 2021
2092b09
[Core]Fix ray.kill doesn't cancel pending actor bug (#13254)
ffbin Feb 9, 2021
d7301a5
[RLlib]: Trajectory View API: Keep env infos (e.g. for postprocessing…
sven1977 Feb 9, 2021
3c8b164
[tune] pass trainable function name when using `tune.with_parameters`…
krfricke Feb 9, 2021
43083b9
[docs] optuna variable typo (#14006)
Crissman Feb 9, 2021
1dcdfe9
[autoscaler/dashboard] Publish resource usage in units of bytes (#14002)
Feb 9, 2021
f51c26b
Revert "[Core]Fix ray.kill doesn't cancel pending actor bug (#13254)"…
simon-mo Feb 9, 2021
e0b8179
Revert "Revert "[Java] fix test hang occasionally when running Failur…
kfstorm Feb 9, 2021
79c7c18
[dask-on-ray] Add multiple return DataFrame shuffle optimization. (#1…
clarkzinzow Feb 9, 2021
7f342eb
Update example shuffle script (#14021)
ericl Feb 10, 2021
7a6f805
[Autoscaler] Monitor refactor for backward compatability. (#13970)
Feb 10, 2021
8b7cf7c
Add tip on how to disable Ray OOM handler (#14017)
ericl Feb 10, 2021
8ca0a32
HotFix k8s autoscaling (#14024)
DmitriGekhtman Feb 10, 2021
ce80ef5
[Docs] RayDP Documentation (#14018)
Feb 10, 2021
1754359
[Core]Fix ray.kill doesn't cancel pending actor bug (#14025)
ffbin Feb 10, 2021
37c7daa
[RLlib] DDPG: Support simplex action space. (#14011)
sven1977 Feb 10, 2021
81e7434
[RLlib] TFPolicy.export_model: Add timestep placeholder to model's si…
sven1977 Feb 10, 2021
1ef2a67
[tune] add scalability release tests (#13986)
krfricke Feb 10, 2021
68e985d
[hotfix][docs] RayDP tensorflow != pytorch (#14044)
Feb 10, 2021
6f9d39f
Revert "[Autoscaler] Monitor refactor for backward compatability. (#1…
architkulkarni Feb 10, 2021
fc89984
Subtract from num bytes in use (#13944)
stephanie-wang Feb 10, 2021
75fbd48
[doc] Minor fix to indentation (#14040)
thomasjpfan Feb 10, 2021
05ab75f
[docs] Add mode to Ray Tune quick start (#14023)
Crissman Feb 10, 2021
c5574a3
[dask-on-ray] Add better Dask-on-Ray example, and detail custom shuff…
clarkzinzow Feb 10, 2021
d87a82e
Revert "Revert "[Autoscaler] Monitor refactor for backward compat…
Feb 11, 2021
f6cfc44
[autoscaler] run setup commands with restart_only=True (#13836)
ijrsvt Feb 11, 2021
a2f7998
[RLlib] Issue #13342: Add `validate_spaces` to MB-MPO. (#14038)
sven1977 Feb 11, 2021
4db8640
[RLlib] Issue #13507: Fix MB-MPO CartPole Env's reward function as we…
sven1977 Feb 11, 2021
cd7e567
[Core] Ownership-based Object Directory - Added support for object sp…
clarkzinzow Feb 11, 2021
cb8523a
Fix the wrong spark on ray link. (#14057)
rkooo567 Feb 11, 2021
2af1f06
Fix broken link to Flow docs (#14058)
jeroenboeye Feb 11, 2021
a430ac2
[Tune] Revert Pinning Tune Dependencies (#14059)
amogkam Feb 11, 2021
24e020b
[Doc] Add PTL and RAG to community integrations (#14064)
amogkam Feb 11, 2021
02938f3
[hotfix] Disable dashboard agent windows (#14062)
Feb 12, 2021
6644a0f
[autoscaler][kubernetes][docs] Updated Kubernetes Documentation (#14016)
DmitriGekhtman Feb 12, 2021
936cb59
[RLlib] Issue #13646: Rewards still not available in loss/json-output…
sven1977 Feb 12, 2021
c7ff69f
[OBOD] Add support for ownership-based object directory object recove…
clarkzinzow Feb 12, 2021
c9a9d42
[OBOD] Disable the ownership-based object directory for all tests tha…
clarkzinzow Feb 12, 2021
20f6cc2
skip test_basic_reconstruction_put on win (#14082)
architkulkarni Feb 12, 2021
ff1b262
[operator] expose RAY_CONFIG_DIR env var (fix #14074) (#14076)
erikerlandson Feb 13, 2021
9dc671a
Unhandled exception handler based on local ref counting (#14049)
ericl Feb 13, 2021
5636af8
[hotfix] Fix mac build (#14075)
Feb 14, 2021
75568f8
skip restart and multi restart test on win (#14084)
architkulkarni Feb 14, 2021
b45ae76
Revert "Unhandled exception handler based on local ref counting (#140…
rkooo567 Feb 15, 2021
82539f2
Export additional metrics to Prometheus (#14061)
kathryn-zhou Feb 15, 2021
b8b2d64
[docs] new Ray Cluster documentation (#13839)
jvrdnd Feb 15, 2021
bcb51a2
[Serve] [Doc] Add version warning (#14001)
architkulkarni Feb 15, 2021
4d727e4
[tune] enable more tests (#13969)
richardliaw Feb 15, 2021
0fb96a6
[Serve] Add support for variable routes (#13968)
architkulkarni Feb 15, 2021
496dd29
skip test_basic_reconstruction_actor_task on win (#14110)
architkulkarni Feb 15, 2021
4846a6c
Release process update (#13798)
Feb 15, 2021
e457872
Revert "Revert "Unhandled exception handler based on local ref counti…
ericl Feb 15, 2021
4ad79ca
[Object Spilling] Remove LRU eviction (#13977)
rkooo567 Feb 15, 2021
5e76389
[serve] Don't overwrite self.handle in StarletteEndpoint (#14111)
edoakes Feb 15, 2021
ebb6e55
[tune] PB2 - add small constant (#14118)
jparkerholder Feb 16, 2021
da0c2c9
[autoscaler] Fix bad reference error when specifying IamInstanceProfi…
pdames Feb 16, 2021
350fb5b
[autoscaler] Remove Hardcoded 8265 (#14112)
ijrsvt Feb 16, 2021
e434ffe
[tune] Avoid crash in client mode when return results creating logdir…
ericl Feb 16, 2021
1 change: 1 addition & 0 deletions .bazelrc
@@ -95,6 +95,7 @@ test:asan --test_env=ASAN_OPTIONS="detect_leaks=0"
test:asan --test_env=LD_PRELOAD="/usr/lib/x86_64-linux-gnu/libasan.so.2 /usr/lib/gcc/x86_64-linux-gnu/7/libasan.so"
# For example, for Ubuntu 18.04 libasan can be found here:
# test:asan --test_env=LD_PRELOAD="/usr/lib/gcc/x86_64-linux-gnu/7/libasan.so"
test:asan-buildkite --test_env=LD_PRELOAD="/usr/lib/x86_64-linux-gnu/libasan.so.5"

# CI configuration:
aquery:ci --color=no
21 changes: 20 additions & 1 deletion .buildkite/Dockerfile
@@ -2,18 +2,33 @@ FROM ubuntu:focal

ARG REMOTE_CACHE_URL
ARG BUILDKITE_PULL_REQUEST
ARG BUILDKITE_COMMIT
ARG BUILDKITE_PULL_REQUEST_BASE_BRANCH

ENV DEBIAN_FRONTEND=noninteractive
ENV TZ=America/Los_Angeles

ENV BUILDKITE=true
ENV CI=true
ENV PYTHON=3.6
ENV RAY_USE_RANDOM_PORTS=1
ENV RAY_DEFAULT_BUILD=1
ENV BUILDKITE_PULL_REQUEST=${BUILDKITE_PULL_REQUEST}
ENV BUILDKITE_COMMIT=${BUILDKITE_COMMIT}
ENV BUILDKITE_PULL_REQUEST_BASE_BRANCH=${BUILDKITE_PULL_REQUEST_BASE_BRANCH}

RUN apt-get update -qq
RUN apt-get install -y -qq \
curl python-is-python3 git build-essential \
sudo unzip apt-utils dialog tzdata wget
sudo unzip apt-utils dialog tzdata wget rsync \
language-pack-en tmux cmake gdb vim htop \
libgtk2.0-dev zlib1g-dev libgl1-mesa-dev

# System conf for tests
RUN locale -a
ENV LC_ALL=en_US.utf8
ENV LANG=en_US.utf8
RUN echo "ulimit -c 0" >> /root/.bashrc

# Setup Bazel caches
RUN (echo "build --remote_cache=${REMOTE_CACHE_URL}" >> /root/.bazelrc); \
@@ -27,3 +42,7 @@ WORKDIR /ray
COPY . .
RUN ./ci/travis/ci.sh init
RUN bash --login -i ./ci/travis/ci.sh build

# Run determine test to run
RUN bash --login -i -c "python ./ci/travis/determine_tests_to_run.py --output=json > affected_set.json"
RUN cat affected_set.json
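
The build above writes `affected_set.json`, and the steps in `.buildkite/pipeline.yml` carry `conditions` lists of `RAY_CI_*_AFFECTED` names. The diff does not show the JSON layout or how the two are matched, so the following is only a sketch under two assumptions: that the file holds a JSON list of flag names, and that a step runs when it has no `conditions` or when any listed condition appears in that list. The helper names `should_run` and `filter_steps` are hypothetical.

```python
import json


def should_run(step, affected):
    """Assumed rule: run a step with no conditions, or with any condition affected."""
    conditions = step.get("conditions", [])
    return not conditions or any(name in affected for name in conditions)


def filter_steps(steps, affected_json):
    """Keep the steps whose conditions match the (assumed) JSON list of flags."""
    affected = set(json.loads(affected_json))
    return [step for step in steps if should_run(step, affected)]


if __name__ == "__main__":
    # Hypothetical data shaped like the pipeline entries in this diff.
    steps = [
        {"label": ":book: Lint"},
        {"label": ":java: Java", "conditions": ["RAY_CI_JAVA_AFFECTED"]},
        {
            "label": ":octopus: Tune tests and examples",
            "conditions": ["RAY_CI_TUNE_AFFECTED"],
        },
    ]
    # Assumed file contents: a JSON list of affected flag names.
    affected_json = '["RAY_CI_TUNE_AFFECTED"]'
    for step in filter_steps(steps, affected_json):
        print(step["label"])
```

Under these assumptions, unconditioned steps such as Lint always run, while a Java-only step is skipped when only Tune is affected.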
186 changes: 182 additions & 4 deletions .buildkite/pipeline.yml
@@ -1,6 +1,184 @@
- label: "Ray Core Tests (:buildkite: Experimental)"
- label: ":book: Lint"
commands:
- bazel test --config=ci $(./scripts/bazel_export_options) --build_tests_only -- //:all -rllib/...
- label: "Ray Dashboard Tests"
- export LINT=1
- ./ci/travis/install-dependencies.sh
- ./ci/travis/ci.sh lint
- ./ci/travis/ci.sh build

- label: ":java: Java"
conditions: ["RAY_CI_JAVA_AFFECTED"]
commands:
- bazel test --config=ci $(./scripts/bazel_export_options) python/ray/new_dashboard/...
- apt-get install -y openjdk-8-jdk maven clang-format
# Compile Java again so bazel will compile Java as a language.
- RAY_INSTALL_JAVA=1 ./ci/travis/ci.sh build
- ./java/test.sh

- label: ":java: Streaming"
conditions:
["RAY_CI_STREAMING_PYTHON_AFFECTED", "RAY_CI_STREAMING_JAVA_AFFECTED"]
commands:
- apt-get install -y openjdk-8-jdk maven
# Compile Java again so bazel will compile Java as a language.
- RAY_INSTALL_JAVA=1 ./ci/travis/ci.sh build
- bazel test --config=ci $(./scripts/bazel_export_options)
//streaming:all
- bash streaming/src/test/run_streaming_queue_test.sh

- label: ":cpp: Worker"
commands:
- ./ci/travis/ci.sh test_cpp

- label: ":cpp: Tests"
commands:
- bazel test --config=ci $(./scripts/bazel_export_options)
--build_tests_only
-- //:all -rllib/... -core_worker_test

- label: ":cpp: Tests (ASAN)"
commands:
- bazel test --config=ci --config=asan $(./scripts/bazel_export_options)
--build_tests_only
--config=asan-buildkite
--jobs=2
-- //:all -//:core_worker_test

- label: ":serverless: Dashboard + Serve Tests"
conditions:
[
"RAY_CI_SERVE_AFFECTED",
"RAY_CI_DASHBOARD_AFFECTED",
"RAY_CI_PYTHON_AFFECTED",
]
commands:
- TORCH_VERSION=1.6 ./ci/travis/install-dependencies.sh
- bazel test --config=ci $(./scripts/bazel_export_options)
python/ray/new_dashboard/...
- bazel test --config=ci $(./scripts/bazel_export_options)
python/ray/serve/...

- label: ":python: (Small & Large)"
conditions: ["RAY_CI_PYTHON_AFFECTED"]
commands:
- bazel test --config=ci $(./scripts/bazel_export_options)
--test_tag_filters=-kubernetes,-jenkins_only,-medium_size_python_tests_a_to_j,-medium_size_python_tests_k_to_z
python/ray/tests/...
- bazel test --config=ci $(./scripts/bazel_export_options)
--test_tag_filters=-kubernetes,-jenkins_only,client_tests
--test_env=RAY_CLIENT_MODE=1
python/ray/tests/...
- label: ":python: (Medium A-J)"
conditions: ["RAY_CI_PYTHON_AFFECTED"]
commands:
- bazel test --config=ci $(./scripts/bazel_export_options)
--test_tag_filters=-kubernetes,-jenkins_only,medium_size_python_tests_a_to_j
python/ray/tests/...
- label: ":python: (Medium K-Z)"
conditions: ["RAY_CI_PYTHON_AFFECTED"]
commands:
- bazel test --config=ci $(./scripts/bazel_export_options)
--test_tag_filters=-kubernetes,-jenkins_only,medium_size_python_tests_k_to_z
python/ray/tests/...

- label: ":brain: RLlib: Learning tests (from rllib/tuned_examples/*.yaml)"
conditions: ["RAY_CI_RLLIB_AFFECTED"]
commands:
- RLLIB_TESTING=1 TF_VERSION=2.1.0 TFP_VERSION=0.8 TORCH_VERSION=1.6 ./ci/travis/install-dependencies.sh
- bazel test --config=ci $(./scripts/bazel_export_options)
--build_tests_only
--test_tag_filters=learning_tests_tf
rllib/...
- label: ":brain: RLlib: Learning tests with tf=1.x (from rllib/tuned_examples/*.yaml)"
conditions: ["RAY_CI_RLLIB_AFFECTED"]
commands:
- RLLIB_TESTING=1 TF_VERSION=1.14.0 TFP_VERSION=0.7 TORCH_VERSION=1.6 ./ci/travis/install-dependencies.sh
- bazel test --config=ci $(./scripts/bazel_export_options)
--build_tests_only
--test_tag_filters=learning_tests_tf
rllib/...
- label: ":brain: RLlib: Learning tests with Torch (from rllib/tuned_examples/*.yaml)"
conditions: ["RAY_CI_RLLIB_AFFECTED"]
commands:
- RLLIB_TESTING=1 TF_VERSION=2.1.0 TFP_VERSION=0.8 TORCH_VERSION=1.6 ./ci/travis/install-dependencies.sh
- bazel test --config=ci $(./scripts/bazel_export_options)
--build_tests_only
--test_tag_filters=learning_tests_torch
rllib/...
- label: ":brain: RLlib: Quick Agent train.py runs"
conditions: ["RAY_CI_RLLIB_AFFECTED"]
commands:
- RLLIB_TESTING=1 TF_VERSION=2.1.0 TFP_VERSION=0.8 TORCH_VERSION=1.6 ./ci/travis/install-dependencies.sh
- bazel test --config=ci $(./scripts/bazel_export_options)
--build_tests_only
--test_tag_filters=quick_train
--test_env=RAY_USE_MULTIPROCESSING_CPU_COUNT=1
rllib/...
# Test everything that does not have any of the "main" labels:
# "learning_tests|quick_train|examples|tests_dir".
- bazel test --config=ci $(./scripts/bazel_export_options)
--build_tests_only
--test_tag_filters=-learning_tests_tf,-learning_tests_torch,-quick_train,-examples,-tests_dir
--test_env=RAY_USE_MULTIPROCESSING_CPU_COUNT=1
rllib/...
- label: ":brain: RLlib: rllib/examples/"
conditions: ["RAY_CI_RLLIB_AFFECTED"]
commands:
- RLLIB_TESTING=1 TF_VERSION=2.1.0 TFP_VERSION=0.8 TORCH_VERSION=1.6 ./ci/travis/install-dependencies.sh
- bazel test --config=ci $(./scripts/bazel_export_options) --build_tests_only
--test_tag_filters=examples_A,examples_B --test_env=RAY_USE_MULTIPROCESSING_CPU_COUNT=1 rllib/...
- bazel test --config=ci $(./scripts/bazel_export_options) --build_tests_only
--test_tag_filters=examples_C,examples_D --test_env=RAY_USE_MULTIPROCESSING_CPU_COUNT=1 rllib/...
- bazel test --config=ci $(./scripts/bazel_export_options) --build_tests_only
--test_tag_filters=examples_E,examples_F,examples_G,examples_H,examples_I,examples_J,examples_K,examples_L,examples_M,examples_N,examples_O,examples_P --test_env=RAY_USE_MULTIPROCESSING_CPU_COUNT=1
rllib/...
- bazel test --config=ci $(./scripts/bazel_export_options) --build_tests_only
--test_tag_filters=examples_Q,examples_R,examples_S,examples_T,examples_U,examples_V,examples_W,examples_X,examples_Y,examples_Z --test_env=RAY_USE_MULTIPROCESSING_CPU_COUNT=1
rllib/...
- label: ":brain: RLlib: rllib/tests/ (A-L)"
conditions: ["RAY_CI_RLLIB_AFFECTED"]
commands:
- RLLIB_TESTING=1 TF_VERSION=2.1.0 TFP_VERSION=0.8 TORCH_VERSION=1.6 ./ci/travis/install-dependencies.sh
- bazel test --config=ci $(./scripts/bazel_export_options) --build_tests_only
--test_tag_filters=tests_dir_A,tests_dir_B,tests_dir_C,tests_dir_D,tests_dir_E,tests_dir_F,tests_dir_G,tests_dir_H,tests_dir_I,tests_dir_J,tests_dir_K,tests_dir_L --test_env=RAY_USE_MULTIPROCESSING_CPU_COUNT=1
rllib/...
- label: ":brain: RLlib: rllib/tests/ (M-Z)"
conditions: ["RAY_CI_RLLIB_AFFECTED"]
commands:
- RLLIB_TESTING=1 TF_VERSION=2.1.0 TFP_VERSION=0.8 TORCH_VERSION=1.6 ./ci/travis/install-dependencies.sh
- bazel test --config=ci $(./scripts/bazel_export_options) --build_tests_only
--test_tag_filters=tests_dir_M,tests_dir_N,tests_dir_O,tests_dir_P,tests_dir_Q,tests_dir_R,tests_dir_S,tests_dir_T,tests_dir_U,tests_dir_V,tests_dir_W,tests_dir_X,tests_dir_Y,tests_dir_Z --test_env=RAY_USE_MULTIPROCESSING_CPU_COUNT=1
rllib/...

- label: ":octopus: Tune tests and examples"
conditions: ["RAY_CI_TUNE_AFFECTED"]
commands:
- TUNE_TESTING=1 ./ci/travis/install-dependencies.sh
- bazel test --config=ci $(./scripts/bazel_export_options) --test_tag_filters=-jenkins_only,-example python/ray/tune/...
- bazel test --config=ci $(./scripts/bazel_export_options) --build_tests_only --test_tag_filters=example,-tf,-pytorch,-py37,-flaky python/ray/tune/...
- bazel test --config=ci $(./scripts/bazel_export_options) --build_tests_only --test_tag_filters=tf,-pytorch,-py37,-flaky python/ray/tune/...
- bazel test --config=ci $(./scripts/bazel_export_options) --build_tests_only --test_tag_filters=-tf,pytorch,-py37,-flaky python/ray/tune/...
- bazel test --config=ci $(./scripts/bazel_export_options) --build_tests_only --test_tag_filters=-py37,flaky python/ray/tune/...

- label: ":octopus: SGD tests and examples"
conditions: ["RAY_CI_SGD_AFFECTED"]
commands:
- SGD_TESTING=1 ./ci/travis/install-dependencies.sh
- bazel test --config=ci $(./scripts/bazel_export_options) --build_tests_only --test_tag_filters=tf,-pytorch,-py37 python/ray/util/sgd/...
- bazel test --config=ci $(./scripts/bazel_export_options) --build_tests_only --test_tag_filters=-tf,pytorch,-py37 python/ray/util/sgd/...

- label: ":octopus: Tune/SGD tests and examples. Python 3.7"
conditions: ["RAY_CI_TUNE_AFFECTED", "RAY_CI_SGD_AFFECTED"]
commands:
- TUNE_TESTING=1 PYTHON=3.7 INSTALL_HOROVOD=1 ./ci/travis/install-dependencies.sh
# Because the Python version changed, we need to re-install Ray here
- rm -rf ./python/ray/thirdparty_files; ./ci/travis/ci.sh build
- bazel test --config=ci $(./scripts/bazel_export_options) --build_tests_only --test_tag_filters=py37 python/ray/tune/...
- bazel test --config=ci $(./scripts/bazel_export_options) --build_tests_only python/ray/util/xgboost/...

- label: ":book: Doc tests and examples"
conditions:
["RAY_CI_PYTHON_AFFECTED", "RAY_CI_TUNE_AFFECTED", "RAY_CI_DOC_AFFECTED"]
commands:
- DOC_TESTING=1 ./ci/travis/install-dependencies.sh
- bazel test --config=ci $(./scripts/bazel_export_options) --build_tests_only --test_tag_filters=-tf,-pytorch,-py37 doc/...
- bazel test --config=ci $(./scripts/bazel_export_options) --build_tests_only --test_tag_filters=tf,-pytorch,-py37 doc/...
- bazel test --config=ci $(./scripts/bazel_export_options) --build_tests_only --test_tag_filters=-tf,pytorch,-py37 doc/...
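
The steps above split each test directory into several `bazel test` runs using `--test_tag_filters`. A minimal sketch of the selection rule follows, assuming the documented Bazel behaviour: a test runs if it carries at least one included tag (when any are listed) and none of the excluded tags. `tag_filter_selects` is a hypothetical helper written for illustration; it is not part of Bazel or of this repository.

```shell
# Hypothetical helper mimicking how --test_tag_filters selects a test.
# $1: comma-separated filter list (a leading "-" marks an excluded tag)
# $2: space-separated tags declared on the test target
tag_filter_selects() {
  filters="$1"
  tags=" $2 "
  has_positive=0
  matched_positive=0
  old_ifs="$IFS"
  IFS=','
  for f in $filters; do
    case "$f" in
      -*)
        # Excluded tag: reject the test as soon as it carries this tag.
        case "$tags" in
          *" ${f#-} "*) IFS="$old_ifs"; echo skip; return 0 ;;
        esac
        ;;
      *)
        # Included tag: remember that one was listed and whether it matched.
        has_positive=1
        case "$tags" in
          *" $f "*) matched_positive=1 ;;
        esac
        ;;
    esac
  done
  IFS="$old_ifs"
  if [ "$has_positive" -eq 1 ] && [ "$matched_positive" -eq 0 ]; then
    echo skip
  else
    echo run
  fi
}

tag_filter_selects "tf,-pytorch,-py37" "tf"        # prints: run
tag_filter_selects "tf,-pytorch,-py37" "tf py37"   # prints: skip
tag_filter_selects "-tf,-pytorch,-py37" "example"  # prints: run
```

Under this rule the three `doc/...` invocations cover untagged tests, `tf` tests, and `pytorch` tests separately, while anything tagged `py37` is left to the Python 3.7 job.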
54 changes: 44 additions & 10 deletions .travis.yml
@@ -78,7 +78,9 @@ matrix:
- . ./ci/travis/ci.sh build
script:
# Run all C++ unit tests with ASAN enabled. ASAN adds too much overhead to run Python tests.
- bazel test --config=ci $(./scripts/bazel_export_options) --build_tests_only -- //:all
# NOTE: core_worker_test is out-of-date and should already be covered by
# Python tests.
- bazel test --config=ci $(./scripts/bazel_export_options) --build_tests_only -- //:all -core_worker_test

- os: osx
osx_image: xcode7
@@ -195,6 +197,7 @@ matrix:
env:
# - PYTHON=3.6
- LINUX_WHEELS=1 LINUX_JARS=1
- DOCKER_BUILD_PY37=1
- PYTHONWARNINGS=ignore
- RAY_INSTALL_JAVA=1
language: java
@@ -207,10 +210,32 @@ matrix:
- . ./ci/travis/ci.sh test_wheels
- export PATH="$HOME/miniconda3/bin:$PATH"
- python -m pip install docker
- if [[ "$TRAVIS_PULL_REQUEST" != "false" ]]; then python $TRAVIS_BUILD_DIR/ci/travis/build-docker-images.py; fi
- if [[ "$TRAVIS_PULL_REQUEST" != "false" ]]; then python $TRAVIS_BUILD_DIR/ci/travis/build-docker-images.py PY37; fi
- bash ./java/build-jar-multiplatform.sh linux
cache: false


# Build Py36 & Py38 Docker Images
- os: linux
env:
- LINUX_WHEELS=1
- DOCKER_BUILD_PY36_38=1
- PYTHONWARNINGS=ignore
language: java
jdk: openjdk8
install:
- . ./ci/travis/ci.sh init RAY_CI_LINUX_WHEELS_AFFECTED
before_script:
- . ./ci/travis/ci.sh build
script:
- wget --quiet "https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh" -O miniconda3.sh
- bash miniconda3.sh -b -p "$HOME/miniconda3"
- export PATH="$HOME/miniconda3/bin:$PATH"
- conda install -y python=3.7.6
- python -m pip install docker
- if [[ "$TRAVIS_PULL_REQUEST" != "false" ]]; then python $TRAVIS_BUILD_DIR/ci/travis/build-docker-images.py PY36_PY38; fi
cache: false

# Build and deploy multi-platform jars.
- os: linux
env:
@@ -418,6 +443,7 @@ matrix:
script:
- ./ci/keep_alive bazel test --config=ci $(./scripts/bazel_export_options) --build_tests_only --test_tag_filters=py37 python/ray/tune/...
- ./ci/keep_alive bazel test --config=ci $(./scripts/bazel_export_options) --build_tests_only python/ray/util/xgboost/...
- ./ci/keep_alive bazel test --config=ci $(./scripts/bazel_export_options) --build_tests_only python/ray/util/lightning_accelerators/...
# There are no python 3.7 tests for RaySGD at the moment
# - ./ci/keep_alive bazel test --config=ci --build_tests_only --test_tag_filters=py37 python/ray/util/sgd/...
# - ./ci/keep_alive bazel test --config=ci --build_tests_only --test_tag_filters=py37 doc/...
@@ -435,11 +461,10 @@ matrix:
script:
- . ./ci/travis/ci.sh test_cpp
script:
# raylet integration tests (core_worker_tests included in bazel tests below)
- ./ci/suppress_output bash src/ray/test/run_object_manager_tests.sh

# cc bazel tests (w/o RLlib)
- ./ci/suppress_output bazel test --config=ci $(./scripts/bazel_export_options) --build_tests_only -- //:all -rllib/...
# NOTE: core_worker_test is out-of-date and should already be covered by Python
# tests.
- ./ci/suppress_output bazel test --config=ci $(./scripts/bazel_export_options) --build_tests_only -- //:all -rllib/... -core_worker_test

# ray serve tests
- if [ $RAY_CI_SERVE_AFFECTED == "1" ]; then ./ci/keep_alive bazel test --config=ci $(./scripts/bazel_export_options) --test_tag_filters=-jenkins_only python/ray/serve/...; fi
@@ -469,7 +494,7 @@ deploy:
on:
repo: ray-project/ray
all_branches: true
condition: $LINUX_WHEELS = 1 || $MAC_WHEELS = 1
condition: ($LINUX_WHEELS = 1 && $DOCKER_BUILD_PY37 = 1) || $MAC_WHEELS = 1

- provider: s3
edge: true # This supposedly opts in to deploy v2.
@@ -485,16 +510,16 @@ deploy:
on:
branch: master
repo: ray-project/ray
condition: $LINUX_WHEELS = 1 || $MAC_WHEELS = 1
condition: ($LINUX_WHEELS = 1 && $DOCKER_BUILD_PY37 = 1) || $MAC_WHEELS = 1

- provider: script
edge: true # This supposedly opts in to deploy v2.
script: export PATH="$HOME/miniconda3/bin:$PATH"; ./ci/keep_alive python $TRAVIS_BUILD_DIR/ci/travis/build-docker-images.py
script: export PATH="$HOME/miniconda3/bin:$PATH"; ./ci/keep_alive python $TRAVIS_BUILD_DIR/ci/travis/build-docker-images.py PY37
skip_cleanup: true
on:
repo: ray-project/ray
all_branches: true
condition: $LINUX_WHEELS = 1
condition: $LINUX_WHEELS = 1 && $DOCKER_BUILD_PY37 = 1

# Upload jars so that we can debug locally for every commit
- provider: s3
@@ -528,3 +553,12 @@ deploy:
repo: ray-project/ray
branch: master
condition: $MULTIPLATFORM_JARS = 1 || $MAC_JARS = 1 || $LINUX_JARS = 1

- provider: script
edge: true # This supposedly opts in to deploy v2.
script: export PATH="$HOME/miniconda3/bin:$PATH"; ./ci/keep_alive python $TRAVIS_BUILD_DIR/ci/travis/build-docker-images.py PY36_PY38
skip_cleanup: true
on:
repo: ray-project/ray
all_branches: true
condition: $LINUX_WHEELS = 1 && $DOCKER_BUILD_PY36_38 = 1
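
The deploy `condition` entries in this file decide which job publishes wheels and which job builds each set of Docker images. A minimal sketch of how such conditions behave follows, assuming Travis evaluates each `condition` string inside a bash `[[ ... ]]` test. `eval_condition` is a hypothetical helper for local experimentation and is not part of the repository. The sketch also shows why whitespace around `=` matters: written without spaces, `$VAR=1` is a single non-empty word, which `[[ ]]` treats as true.

```shell
# Hypothetical helper: evaluate a condition string the way
# `if [[ <condition> ]]` would, with the given VAR=value pairs in the environment.
# $1: condition string; remaining args: environment assignments for the job.
eval_condition() {
  cond="$1"
  shift
  env "$@" bash -c "if [[ $cond ]]; then echo yes; else echo no; fi"
}

# Wheel deploy condition with spaces around every "=".
spaced='($LINUX_WHEELS = 1 && $DOCKER_BUILD_PY37 = 1) || $MAC_WHEELS = 1'
# Same condition with no spaces around the second "=".
unspaced='($LINUX_WHEELS = 1 && $DOCKER_BUILD_PY37=1) || $MAC_WHEELS = 1'

# Environment of the Py36/Py38 Docker job: it should not deploy wheels.
eval_condition "$spaced"   LINUX_WHEELS=1 DOCKER_BUILD_PY36_38=1  # prints: no
eval_condition "$unspaced" LINUX_WHEELS=1 DOCKER_BUILD_PY36_38=1  # prints: yes

# Environment of the Py37 wheel job: it should deploy wheels.
eval_condition "$spaced"   LINUX_WHEELS=1 DOCKER_BUILD_PY37=1     # prints: yes
```

With the spaced form, only the job that sets `DOCKER_BUILD_PY37=1` (or `MAC_WHEELS=1`) satisfies the wheel-deploy condition, so the two Linux jobs that both set `LINUX_WHEELS=1` do not upload the same artifacts twice.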