Commit dda69ba

committed
update README.md and rand docstring
1 parent 05979bb commit dda69ba

3 files changed

Lines changed: 107 additions & 47 deletions

File tree

.gitignore

Lines changed: 2 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -1,3 +1,5 @@
11
*.jl.cov
22
*.jl.*.cov
33
*.jl.mem
4+
.*~
5+
README.html

README.md

Lines changed: 95 additions & 37 deletions
@@ -7,40 +7,51 @@
77
This package explores a possible extension of `rand`-related
88
functionalities (from the `Random` module); the code is initially
99
taken from https://github.com/JuliaLang/julia/pull/24912.
10-
Note that type piracy is commited!
10+
Note that type piracy is committed!
1111
While hopefully useful, this package is still experimental, and
12-
hence unstable. Design or implementation contributions are welcome.
12+
hence unstable. User feedback, and design or implementation contributions are welcome.
1313

14-
This does mainly 3 things:
14+
This does essentially 4 things:
1515

1616
1) define distribution objects, to give first-class status to features
1717
provided by `Random`; for example `rand(Normal(), 3)` is equivalent
1818
to `randn(3)`; other available distributions: `Exponential`,
19-
`CloseOpen` (for generation of floats in a close-open range),
20-
`Uniform` (which can wrap an implicit uniform distribution),
21-
`make` (to combine distribution for objects made of multiple
22-
scalars, like `Pair`, `Tuple`, or `Complex`, or for containers);
19+
`CloseOpen` (for generation of floats in a close-open range) and friends,
20+
`Uniform` (which can wrap an implicit uniform distribution);
2321

24-
2) define generation of some containers filled with random values
22+
2) define `make` methods, which can combine distributions for objects made of multiple scalars, like
23+
`Pair`, `Tuple`, or `Complex`, or describe how to generate more complex objects, like containers;
24+
25+
3) extend the `rand([rng], [S], dims)` API to allow the generation of containers other than arrays
2526
(like `Set`, `Dict`, `SparseArray`, `String`, `BitArray`);
2627

27-
3) define a `Rand` iterator, which produces lazily random values.
28+
4) define a `Rand` iterator, which lazily produces random values.
2829

2930

3031
Point 1) defines a `Distribution` type which is incompatible with the
3132
"Distributions.jl" package. Input on how to unify the two approaches is
3233
welcome.
33-
Point 2) goes somewhat against the trend in `Base` to create
34-
containers using their constructors -- which by the way may be
35-
achieved with the `Rand` iterator from point 3).
36-
Still, I like the terser approach here, as it simply generalizes
37-
to other containers the __current__ `rand` API creating arrays.
38-
See the issue linked above for a discussion on those topics.
39-
40-
For convenience, the following objects from `Random` are re-exported
34+
35+
Point 2) is really the core of this package. `make` provides a vocabulary to define the generation
36+
of "scalars" which require more than one argument to be described, e.g. pairs from `1:3` to `Int`
37+
(`rand(make(Pair, 1:3, Int))`) or regular containers (e.g. `make(Array, 2, 3)`). The point of
38+
calling `make` rather than putting all the arguments in `rand` directly is simplicity and
39+
composability: the `make` call always occurs as the second argument to `rand` (or first if the RNG
40+
is omitted). For example, `rand(make(Array, 2, 3), 3)` creates an array of matrices.
41+
Of course, `make` is not necessary, in that the same can be achieved with an ad hoc `struct`,
42+
which in some cases is clearer (e.g. `Normal(m, s)` rather than something like `make(Float64, Val(:Normal), m, s)`).
43+
44+
Point 3) allows something like `rand(1:30, Set, 10)` to produce a `Set` of length `10` with values
45+
from `1:30`. The idea is that `rand([rng], [S], Cont, etc...)` should always be equivalent to
46+
`rand([rng], make(Cont, [S], etc...))`. This design goes somewhat against the trend in `Base` to create
47+
containers using their constructors -- which by the way may be achieved via the `Rand` iterator from
48+
point 4). Still, I like the terse approach here, as it simply generalizes to other containers the
49+
_current_ `rand` API creating arrays. See the issue linked above for a discussion on these topics.
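
To make the contrast in point 3) concrete, here is a minimal sketch that uses only the standard `Random` module. The helper `rand_set` is hypothetical and is not this package's implementation; it only illustrates one plausible meaning of "a `Set` of length `n`", next to the constructor-based style.

```julia
using Random

# Hypothetical helper (not part of RandomExtensions): keep drawing from
# `values` until the set holds `n` distinct elements.
# Caveat: the loop never ends if `values` has fewer than `n` distinct elements.
function rand_set(rng::AbstractRNG, values, n::Integer)
    s = Set{eltype(values)}()
    while length(s) < n
        push!(s, rand(rng, values))
    end
    return s
end

rng = MersenneTwister(42)

# Always ends up with exactly 10 distinct values from 1:30.
s = rand_set(rng, 1:30, 10)

# Constructor-based style: 10 draws, but duplicates collapse,
# so the resulting set may hold fewer than 10 elements.
t = Set(rand(rng, 1:30) for _ in 1:10)
```

Under this reading, asking for more elements than there are distinct values can never succeed, which is why the requested length has to stay within the size of the source collection.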
50+
51+
For convenience, the following names from `Random` are re-exported
4152
in this package: `rand!`, `AbstractRNG`, `MersenneTwister`,
4253
`RandomDevice` (`rand` is in `Base`). Functions like `randn!` or
43-
`bitrand` are considered to be obsoleted by this package so are not
54+
`randstring` are considered to be obsoleted by this package so are not
4455
re-exported. It's still needed to import `Random` separately in order
4556
to use functions which don't extend the `rand` API, namely
4657
`randsubseq`, `shuffle`, `randperm`, `randcycle`, and their mutating
@@ -51,11 +62,14 @@ There is not much documentation for now: `rand`'s docstring is updated,
5162
and here are some examples:
5263

5364
```julia
54-
julia> rand(CloseOpen()) # like rand(Float64)
65+
julia> rand(CloseOpen(Float64)) # equivalent to rand(Float64)
5566
0.7678877639669386
5667

57-
julia> rand(CloseOpen(1.0, 10.0)) # generation in [1.0, 10.0)
58-
4.309057677479184
68+
julia> rand(CloseClose(1.0f0, 10)) # generation in [1.0f0, 10.0f0]
69+
6.62467f0
70+
71+
julia> rand(OpenOpen(2.0^52, 2.0^52+1)) == 2.0^52 # exactness not guaranteed for "unreasonable" values!
72+
true
5973

6074
julia> rand(Normal(0.0, 10.0)) # explicit μ and σ parameters
6175
-8.473790458128912
@@ -66,21 +80,24 @@ julia> rand(Uniform(1:3)) # equivalent to rand(1:3)
6680
julia> rand(make(Pair, 1:10, Normal())) # random Pair, where both members have distinct distributions
6781
5 => 0.674375
6882

69-
julia> rand(make(Pair{Number, Any}, 1:10, Normal())) # specify the Pair type
83+
julia> rand(make(Pair{Number,Any}, 1:10, Normal())) # specify the Pair type
7084
Pair{Number,Any}(1, -0.131617)
7185

7286
julia> rand(Pair{Float64,Int}) # equivalent to rand(make(Pair, Float64, Int))
7387
0.321676 => -4583276276690463733
7488

75-
julia> rand(make(Tuple, 1:10, Normal()))
76-
(9, 1.3407309364427373)
89+
julia> rand(make(Tuple, 1:10, UInt8, OpenClose()))
90+
(9, 0x6b, 0.34900083923775505)
7791

7892
julia> rand(Tuple{Float64,Int}) # equivalent to rand(make(Tuple, Float64, Int))
7993
(0.9830769470405203, -6048436354564488035)
8094

8195
julia> rand(make(NTuple{3}, 1:10)) # produces a 3-tuple with values from 1:10
8296
(5, 9, 6)
8397

98+
julia> rand(make(NTuple{N,UInt8} where N, 1:3, 5))
99+
(0x02, 0x03, 0x02, 0x03, 0x02)
100+
84101
julia> rand(make(NTuple{3}, make(Pair, 1:9, Bool))) # make calls can be nested
85102
(2 => false, 8 => true, 7 => false)
86103

@@ -96,39 +113,52 @@ julia> rand(Normal(ComplexF64)) # equivalent to randn(ComplexF64)
96113
julia> rand(Set, 3)
97114
Set([0.717172, 0.78481, 0.86901])
98115

99-
julia> rand(1:9, Set, 3)
116+
julia> rand!(ans, Exponential())
117+
Set([0.7935073925105659, 2.593684878770254, 1.629181233597078])
118+
119+
julia> rand(1:9, Set, 3) # if you try `rand(1:3, Set, 9)`, it will take a while ;-)
100120
Set([3, 5, 8])
101121

122+
julia> rand(Dict{String,Int8}, 2)
123+
Dict{String,Int8} with 2 entries:
124+
"vxybIbae" => 42
125+
"bO2fTwuq" => -13
126+
102127
julia> rand(make(Pair, 1:9, Normal()), Dict, 3)
103128
Dict{Int64,Float64} with 3 entries:
104129
9 => 0.916406
105130
3 => -2.44958
106131
8 => -0.703348
107132

108-
julia> rand(0.3, 9) # equivalent to sprand(9, 0.3)
133+
julia> rand(SparseVector, 0.3, 9) # equivalent to sprand(9, 0.3)
109134
9-element SparseVector{Float64,Int64} with 3 stored entries:
110135
[1] = 0.173858
111136
[6] = 0.568631
112137
[8] = 0.297207
113138

114-
julia> rand(Normal(), 0.3, 2, 3) # equivalent to sprandn(2, 3, 0.3)
139+
julia> rand(Normal(), SparseMatrixCSC, 0.3, 2, 3) # equivalent to sprandn(2, 3, 0.3)
115140
2×3 SparseMatrixCSC{Float64,Int64} with 2 stored entries:
116141
[2, 1] = 0.448981
117142
[1, 2] = 0.730103
118143

144+
# as with Array, sparse arrays are special-cased: `SparseVector` or `SparseMatrixCSC` can be omitted:
145+
146+
julia> rand(make(make(1:9, 0.3, 2, 3), 0.1, 4)) # possible, but ugly output when non-empty :-/
147+
4-element SparseVector{SparseMatrixCSC{Int64,Int64},Int64} with 0 stored entries
148+
119149
julia> rand(String, 4) # equivalent to randstring(4)
120150
"5o75"
121151

122-
julia> rand("123", String, 4) # String considered as a container
152+
julia> rand("123", String, 4) # like above, String creation with the "container" syntax ...
123153
"2131"
124154

155+
julia> rand(make(String, 3, "123")) # ... which is as always equivalent to a call to make
156+
"211"
157+
125158
julia> rand(String, Set, 3) # String considered as a scalar
126159
Set(["0Dfqj6Yr", "ILngfcRz", "HT5IEyK3"])
127160

128-
julia> rand(make(String, 3, "123"))
129-
"211"
130-
131-
julia> rand(BitArray, 3) # equivalent to bitrand(3)
161+
julia> rand(BitArray, 3) # equivalent to, but unfortunately more verbose than, bitrand(3)
132162
3-element BitArray{1}:
133163
true
134164
true
@@ -150,8 +180,20 @@ julia> julia> rand(Bernoulli(0.2), BitVector, 10) # using the Bernoulli distribu
150180
julia> rand(1:3, NTuple{3}) # NTuple{3} considered as a container, equivalent to rand(make(NTuple{3}, 1:3))
151181
(3, 3, 1)
152182

183+
julia> rand(1:3, Tuple{Int,UInt8, BigFloat}) # works also with more general tuple types
184+
(3, 0x02, 2.0)
185+
186+
julia> RandomExtensions.random_staticarrays() # poor man's conditional modules!
187+
# ugly warning
188+
189+
julia> rand(make(MVector{2,AbstractString}, String), SMatrix{3, 2})
190+
3×2 SArray{Tuple{3,2},MArray{Tuple{2},AbstractString,1,2},2,6} with indices SOneTo(3)×SOneTo(2):
191+
["SzPKXHFk", "1eFXaUiM"] ["RJnHwhb7", "jqfLcY8a"]
192+
["FMTKcBY8", "eoYtNntD"] ["FzdD530L", "ux6sWGMU"]
193+
["fFJuUtJQ", "H2mAQrIV"] ["pt0OYFJw", "O0fCfjjR"]
194+
153195
julia> Set(Iterators.take(Rand(RandomDevice(), 1:10), 3)) # RNG defaults to Random.GLOBAL_RNG
154-
Set([9, 2, 6])
196+
Set([9, 2, 6]) # note that the set could end up with fewer than 3 elements if `Rand` generates duplicates
155197

156198
julia> collect(Iterators.take(Uniform(1:10), 3)) # distributions can be iterated over, using Random.GLOBAL_RNG implicitly
157199
3-element Array{Int64,1}:
@@ -160,10 +202,10 @@ julia> collect(Iterators.take(Uniform(1:10), 3)) # distributions can be iterated
160202
5
161203
```
162204

163-
In some cases, the `Rand` iterator can provide some efficiency gains compared to
164-
repeated calls to `rand`, as it uses the same mechanism as non-scalar generation.
165-
For example, given `a = zeros(10000)`,
166-
`a .+ Rand(1:1000).()` will be faster than `a .+ rand.(Ref(1:1000))`.
205+
In some cases, the `Rand` iterator can provide efficiency gains compared to
206+
repeated calls to `rand`, as it uses the same mechanism as array generation.
207+
For example, given `a = zeros(1000)` and `s = BitSet(1:1000)`,
208+
`a .+ Rand(s).()` is three times faster than `a .+ rand.(Ref(s))`.
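
The "same mechanism as array generation" is presumably the sampler reuse that `Random` applies when filling arrays. The sketch below shows that idea with the standard library only, using the documented `Random.Sampler(rng, x, Val(Inf))` call; it does not use `Rand` itself and makes no claim about how `Rand` is implemented.

```julia
using Random

rng = MersenneTwister(0)
s = BitSet(1:1000)

# Repeated scalar calls: each `rand(rng, s)` builds a fresh sampler for `s`.
per_call = [rand(rng, s) for _ in 1:1000]

# Build the sampler once (`Val(Inf)` signals that many values will be drawn)
# and reuse it for every draw.
sp = Random.Sampler(rng, s, Val(Inf))
reused = [rand(rng, sp) for _ in 1:1000]
```

Both loops draw from the same distribution; only the amount of per-draw setup differs, which is where a speedup could come from.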
167209

168210
Note: as seen in the examples above, `String` can be considered as a scalar or as a container (in the `rand` API).
169211
In a call like `rand(String)`, both APIs coincide, but in `rand(String, 3)`, should we construct a `String` of
@@ -173,3 +215,19 @@ most useful (and offers the tersest API to compete with `randstring`).
173215
But as this package is still unstable, this choice may be revisited in the future.
174216
Note that it's easy to get the result of the second interpretation via either `rand(make(String), 3)`,
175217
`rand(String, (3,))` or `rand(String, Vector, 3)`.
218+
219+
How to extend: the `make` function is meant to be extensible, and there are some helper functions
220+
which make it easy, but the internals are not fully settled. By default, `make(T, args...)` will
221+
create a `Make{find_type(T, args...)}` object, say `m`, which contains `args...` as fields. For type
222+
stable code, the `rand` machinery likes to know the exact type of the object which will be generated by
223+
`rand(m)`, and `find_type(T, args...)` is supposed to return that type. For example,
224+
`find_type(Pair, 1:3, UInt) == Pair{Int,UInt}`.
225+
Then just define `rand` for `m` as documented in the `Random` module, e.g.
226+
`rand(rng::AbstractRNG, sp::SamplerTrivial{<:Make{P}}) where {P<:Pair} = P(rand(rng, sp[].x), rand(rng, sp[].y))`.
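
The same pattern can be tried with the standard library alone. In the sketch below, `PairOf`, `value_type` and `pair_of` are hypothetical stand-ins for the package's `Make`, `find_type` and `make` (whose internals are not settled); only `Random.SamplerTrivial`, `sp[]` and `Base.eltype` are real `Random`/`Base` API.

```julia
using Random

# Hypothetical stand-in for `Make`: the result type `P` is a type parameter,
# and the two component distributions are stored as fields.
struct PairOf{P<:Pair,X,Y}
    x::X
    y::Y
end

# Hypothetical stand-in for `find_type`: the type produced by `rand(x)`.
value_type(::Type{T}) where {T} = T   # rand(T) yields a T
value_type(x) = eltype(x)             # rand(collection) yields an element

# Hypothetical stand-in for `make(Pair, x, y)`.
pair_of(x, y) =
    PairOf{Pair{value_type(x),value_type(y)},Core.Typeof(x),Core.Typeof(y)}(x, y)

# Tell `Random` which type `rand` produces, so arrays get a concrete eltype.
Base.eltype(::Type{<:PairOf{P}}) where {P} = P

# Hook into the `Random` API: `sp[]` returns the wrapped `PairOf` object.
function Random.rand(rng::AbstractRNG,
                     sp::Random.SamplerTrivial{<:PairOf{P}}) where {P<:Pair}
    m = sp[]
    return P(rand(rng, m.x), rand(rng, m.y))
end

rng = MersenneTwister(1)
m = pair_of(1:3, UInt)
p = rand(rng, m)        # a Pair{Int,UInt} whose first member is in 1:3
v = rand(rng, m, 4)     # a Vector{Pair{Int,UInt}} of length 4
```

Carrying the result type as a type parameter is what lets `rand(rng, m, 4)` allocate an array with a concrete element type before any value is generated.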
227+
228+
This package started out of frustration with the limitations of the `Random` module. Besides
229+
generating simple scalars and arrays, very little is supported out of the box. For example,
230+
generating a random `Dict` is too complex. Moreover, there are too many functions for my taste:
231+
`rand`, `randn`, `randexp`, `sprand` (with its exotic `rfn` parameter), `sprandn`, ~~`sprandexp`~~,
232+
`randstring`, `bitrand`, and mutating counterparts (but I believe `randn` will never go away, as
233+
it's so terse). I hope that this package can serve as a starting point towards improving `Random`.

src/RandomExtensions.jl

Lines changed: 10 additions & 10 deletions
@@ -63,18 +63,18 @@ Pick a random element or collection of random elements from the set of values sp
6363
6464
`S` usually defaults to [`Float64`](@ref).
6565
66-
If `C...` is not specified, `rand` produces a scalar. Otherwise, `C...` can be:
66+
If `C...` is not specified, `rand` produces a scalar. Otherwise, `C` can be:
6767
6868
* a set of integers, or a tuple of `Int`, which specify the dimensions of an `Array` to generate;
69-
* `(Array, dims...)`: same as above, but with `Array` specified explicitely
70-
* `(p::AbstractFloat, m::Integer, [n::Integer])...`, which produces a sparse array of dimensions `(m, n)`,
71-
in which the probability of any element being nonzero is independently given by `p`
72-
* `(String, [n=8])...`, which produces a random `String` of length `n`; the generated string consists of `Char`
73-
taken from a predefined set like `randstring`, and can be specified with the `S` parameter.
74-
* `(Dict, n)...`, which produces a `Dict` of length `n`; `S` must then specify the type of its elements,
75-
e.g. `make(Pair, Int, 2:3)`;
76-
* `(Set, n)...`, which produces a `Set` of length `n`;
77-
* `(BitArray, dims...)...`, which produces a `BitArray` with the specified dimensions.
69+
* `(Array, dims...)`: same as above, but with `Array` specified explicitly;
70+
* `(p::AbstractFloat, m::Integer, [n::Integer])`, which produces a sparse array of dimensions `(m, n)`,
71+
in which the probability of any element being nonzero is independently given by `p`;
72+
* `(String, [n=8])`, which produces a random `String` of length `n`; the generated string consists of `Char`
73+
taken from a predefined set like `randstring`, and can be specified with the `S` parameter;
74+
* `(Dict, n)`, which produces a `Dict` of length `n`; if `Dict` is given without type parameters,
75+
then `S` must be specified;
76+
* `(Set, n)` or `(BitSet, n)`, which produce a set of length `n`;
77+
* `(BitArray, dims...)`, which produces a `BitArray` with the specified dimensions.
7878
7979
For `Array`, `Dict` and `Set`, a less abstract type can be specified, e.g. `Set{Float64}`, to force
8080
the type of the result regardless of the `S` parameter. In particular, in the absence of `S`, the
