JRuby: A Ruby VM in Java
Charles Nutter
Who am I?
- headius@headius.com
- Senior Architect/Technologist at Ventera Corp
- Open Source
- LiteStep
- JRuby Developer
Part 1: Past and Present
What is JRuby?
- A “100% Java”-based Ruby Interpreter
- Mostly 1.8 compatible
- Four years and 15 developers; currently 3-5 active and under heavy development
- Originally modeled in Ruby 1.6 code
- Tri-licensed: CPL, GPL, LGPL
- Sun J2SE 1.4 or higher (Free / OSS coming soon)
- Java/Ruby integration getting better
Why JRuby?
- JVM provides native threading, generational GC, and extensive networking and database support
- Java is very fast these days and most code gets JIT’d
- Wealth of libraries and frameworks; large userbase, wide deployment
- Many Javaists would like to use Ruby more
- Java is “just another platform” for Ruby
- JRuby could help grow Ruby the language apart from C Ruby
- could help define language apart from the platform it’s running on
- Sun, others very interested in dynamic (typed) languages on the JVM
- Javaists (by choice or by force) can help promote Ruby
- Java/Ruby integration merges best of both
- Use Java for its heavyweight advantages (library support, JMS, etc.), and Ruby for business logic
- Ruby + J2EE = enterprise Ruby that managers can swallow
- Ruby + J2ME…someday?
Demo 1: Java Integration
- Ruby code mixed into Java class’s proxy
- Enumerable, list behavior added to java.sql.ResultSet
- JDBC used for DB access (PostgreSQL 8.0)
- Mostly transparent object marshalling
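A rough idea of the mixin trick in plain Ruby. This is not the demo's code: `FakeResultSet` is a hypothetical stand-in for the JRuby proxy of java.sql.ResultSet (the real one would sit on JDBC), and `ResultSetEnumerable` is an illustrative module name.

```ruby
# Hypothetical stand-in for a java.sql.ResultSet proxy: a cursor with
# next / get_string, loosely mirroring ResultSet#next and #getString.
class FakeResultSet
  def initialize(rows)
    @rows = rows
    @index = -1
  end

  # Advance the cursor; true while a current row exists.
  def next
    @index += 1
    @index < @rows.size
  end

  def get_string(column)
    @rows[@index].fetch(column)
  end
end

# Ruby code mixed into the proxy: defining each on top of the cursor
# gives the object map, select, and the rest of Enumerable.
module ResultSetEnumerable
  include Enumerable

  def each
    return enum_for(:each) unless block_given?
    yield self while self.next
  end
end

rows = [{ "name" => "matz" }, { "name" => "ko1" }]
result_set = FakeResultSet.new(rows).extend(ResultSetEnumerable)
names = result_set.map { |row| row.get_string("name") }
```

Each iteration yields the same cursor object, so values have to be read inside the block, much like working with a real result set.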
Peers
- Jython
- Pythonists dig it
- By far the most popular JVM dyn lang
- Established, stable, feature-complete
- Helping to formally define/distinguish Python the language from Python the interpreter
- Interpreted or compiled (runs Python bytecode, or compiles to Java)
- Widely used
- Groovy
- Ruby-like syntax, some features from Nice
- Seamless Java integration
- First dyn language JSR
- Lots of momentum
- Interpreted (JIT) or compiled offline to Java code
- SISC
- JRuby redesign follows similar patterns
- Many others
The Real World
- RDT: A Ruby IDE for Eclipse
- jEdit: A Multi-language Code Editor
- DataVision: Java-based Reporting Software
- Internal projects
- Need more
How “Ruby” is it?
- Of 1049 Rubyicon tests, 80% succeed
- Temporary incompatibilities
- Ruby thread semantics differ from Java’s
- No continuations
- Twice as slow (half as fast?) as C Ruby or worse
- YAML: no up-to-date, working pure Ruby or Java parsers
- Still missing a few 1.8 methods
- Permanent incompatibilities
- System calls, C-language Ruby extensions, anything to do with C
- Platform-specifics: file stats, permissions, process launching, signals, …
Part Two: The Future
- Continuing to improve compatibility
- Running mainstream Ruby apps
- Improving Java integration
- Speeding up
- The New JRuby
What needs to change?
- JRuby deficiencies (as of 0.8.2)
- Stack depth (~ fib(280))
- Threading and thread semantics
- Continuations support
- Speed
- Consistency, maintainability
- Compilation
- Better use of Java’s strengths
- Tighter integration between Java and Ruby
- Ruby deficiencies (as of 1.8):
- Stack depth (~ fib(1325))
- Native threading
- Speed
- Compilation
The New JRuby
- Stackless; Continuation Passing Style (roughly)
- Iterative interpreter
- m:n threading model
- Compilation to Java bytecodes, offline and JIT
- Pluggable architecture
- Seamless, powerful Ruby/Java integration
- Behave in controlled environments
- FAST
Milestones and progress
- Stackless, iterative proof of concept (POC) (Sept 15, complete)
- Redesign, refactoring of POC (Oct)
- Reimpl of interpreter based on POC (Nov)
- Reimpl of built-in classes (Nov-Jan)
- Threading engine (Jan)
- Tail-call optimizations (Jan)
- Continuations (Jan)
- Compilation (Feb – Apr)
- Complete for JavaOne 2006
Demo 2: Fibonacci
- Recursive fib algorithm (contrived, I know)
- JRuby 0.8.2: shallow
- Ruby: deeper
- JRuby “stackless” POC: deepest
(Demo of doing fib 30000 in JRuby! Pretty cool.)
(Somewhat longer demo of 150000. Also cool.)
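Why a "stackless" design goes deeper can be sketched in plain Ruby. This is only an illustration, not the POC's code; it assumes the demo's fib recurses once per step, and `fib_step` and `trampoline` are hypothetical names.

```ruby
# Plain linear recursion: one interpreter stack frame per step, so the
# reachable depth is bounded by the stack size.
def fib_recursive(n, a = 0, b = 1)
  n.zero? ? a : fib_recursive(n - 1, b, a + b)
end

# Same computation, but each step returns a thunk describing the next
# step instead of calling it (roughly continuation-passing style).
def fib_step(n, a = 0, b = 1)
  return a if n.zero?
  -> { fib_step(n - 1, b, a + b) }
end

# Iterative driver: keeps calling thunks until a plain value comes back.
# Stack depth stays constant no matter how large n is.
def trampoline(value)
  value = value.call while value.respond_to?(:call)
  value
end

deep = trampoline(fib_step(30_000))
```

The recursive version may raise SystemStackError at large depths, depending on the interpreter's stack limit; the trampolined version is limited only by heap memory.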
What Else?
- YARV bytecode execution
- MetaRuby’s “Ruby in Ruby” useful to JRuby
- drb proxy to RMI
- ActiveRecord JDBC connector
- WEBrick-mimicking servlets
- Other ideas?
Part Three: What now?
- Redesign is in full swing
- Heavy refactoring of JRuby core
- A better Ruby than ruby?
- Help Wanted!
- zlib implementation using Ruby-Java integration
- File locking using Java’s NIO (New I/O)
- Feature-complete YAML support
- Running mainstream Ruby apps, isolating and reporting errors
- Help with new design and with refactoring effort
- Tangibles
Q&A
- JRuby
- JRuby mailing lists on SF
- Charles Oliver Nutter: headius@headius.com
- Thanks to:
- Thomas Enebo: JRuby project manager
- Kelly Nawrocke: JRuby developer
- David Corbin: JRuby developer, RDT developer
- Special thanks to Jan Arne Petersen, original JRubyist
Questions
- ???: About YAML – parser written in C, have C to Java translators been tried?
- Charles: Might not produce code that would wire in nicely; focusing on a pure Ruby implementation.
- David Black: What about things that it would be nice if they were different than they currently exist in C Ruby – for instance, similar behavior often goes through different code paths? Can you change those things? Will it make it less Ruby?
- Charles: Mainly taken perspective that we are following what Ruby does and following what Matz and company do. Having this other platform will point out inconsistencies; some things are unfollowable. Having two places where behavior is implemented shows inconsistencies.
- ???: Is JRuby going to be reentrant? Will you be able to run multiple JRuby instances in the same process?
- Charles: Yes, unable to control where calls are coming from, so needs to be re-entrant. Either that or able to run multiple lightweight interpreters in the same VM and then manage state. Not thread safe at this point but hopefully that will change.
- Duane Johnson: In the demo, the each iterator isn’t acting very Ruby-like.
- Charles: The demo is kind of put together to show everything. What would probably be better would be a Ruby-Java layer that does “rubyfication.” Code as demo’d was more javaish but still simpler than real Java.
YARV Progress Report
Koichi SASADA
Caution! (review)
- I can’t speak English well
- If I say strange English, you can see the slide page
- Or ask another Japanese. They can speak English well.
- If you have any questions, ask me with:
- Japanese (recommended)
- Ruby, C, Scheme, Java, …
- IRC (#rubyconf on freenode)
Agenda
- Self Introduction and Japanese Activities
- Overview of YARV
- Goal of YARV
- Current YARV status
- YARV Design, Optimization Review
- Evaluation
- Conclusion
Self Introduction
- “SASADA” the family name
- “Koichi” is given name → “ko1”
- A Student for Ph.D. 2nd grade (Not a Son-shi)
- Systems Software for Multithreaded Arch
- SMT/CMP or other tech
- e.g.: Hyper-Threading (Intel), CMT (Sun), Power (IBM)
- OS, Library, Compiler and Interpreter
- YARV is my first step for parallel interpreter
- Computer Architecture for Next Generation at Public Position
- Nihon Ruby no Kai
- Organized by Mr. Takahashi (maki)
- Rubyist Magazine
- vol 10 at 10 Oct 2005
- 1st anniversary at 6 Sep 2005 (vol 9)
- Ruby-dev summary
- English Diary some days
- But retired
Our Activity: Rubyist Magazine
- Many Japanese articles related to Ruby
- Cooperate with Ruby Code & Style?
- I’m writing YARV internal named “YARV Maniacs”
- Many interviews of Japanese Rubyists
RubyMa!
- Published 1 Apr 2005 (April Fools)
- Joke web-zine
- Parody of Negima!
- Many useful articles
- The Takahashi method:
def Takahashi
end
Overview of YARV
Overview: Background
- Ruby is used world-wide, one of the most comfortable programming languages
- Ruby is slow, because interpreter doesn’t use Virtual Machine Technology
- We need Ruby VM!
- YARV: Yet Another Ruby VM
- Started development on 1 Jan 2004
- At that time, there were some VMs for Ruby
- Ruby’s license, of course
Overview: FAQ (review of last year FAQ)
- Q: How is “YARV” pronounced?
- A: You can pronounce “YARV” what you like.
- Q: Should I remember the name “YARV”?
- A: No. If YARV succeeds, it gets renamed to Rite, if it doesn’t, no one will remember it
- About YARV, name is NOT ???
Overview: YARV System
Overview: Current Interpreter
- Ruby Program: a = b + c
- Syntax tree: (a =) → (method dispatch + (b), (c))
- Current interpreter traverses AST directly
Overview YARV – Stack Machine
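A toy comparison of the two approaches for `a = b + c`. This is a sketch only: the node shapes and the instruction names (`getlocal`, `setlocal`, `send`) are illustrative assumptions, not YARV's actual instruction set or compiler.

```ruby
# Current-interpreter style: walk the syntax tree directly.
def eval_tree(node, env)
  type, *rest = node
  case type
  when :lasgn then env[rest[0]] = eval_tree(rest[1], env)
  when :call
    name, recv, arg = rest
    eval_tree(recv, env).public_send(name, eval_tree(arg, env))
  when :lvar then env.fetch(rest[0])
  else raise ArgumentError, "unknown node: #{type}"
  end
end

# VM style, step 1: flatten the tree into a linear instruction sequence.
def compile(node, insns = [])
  type, *rest = node
  case type
  when :lasgn
    compile(rest[1], insns)
    insns << [:setlocal, rest[0]]
  when :call
    name, recv, arg = rest
    compile(recv, insns)
    compile(arg, insns)
    insns << [:send, name, 1]
  when :lvar
    insns << [:getlocal, rest[0]]
  else raise ArgumentError, "unknown node: #{type}"
  end
  insns
end

# VM style, step 2: execute the instructions on a value stack.
def run(insns, env)
  stack = []
  insns.each do |op, x, argc|
    case op
    when :getlocal then stack.push(env.fetch(x))
    when :setlocal then env[x] = stack.pop
    when :send
      args = stack.pop(argc)
      recv = stack.pop
      stack.push(recv.public_send(x, *args))
    end
  end
  env
end

ast = [:lasgn, :a, [:call, :+, [:lvar, :b], [:lvar, :c]]]
insns = compile(ast)
env = run(insns, { b: 2, c: 3 })
```

The compile step runs once, after which the flat sequence can be executed repeatedly (and optimized) without re-walking the tree.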
The Goal of YARV
- YARV: Yet Another RubyVM → The RubyVM
- To be the Ruby 2.0 VM Rite
- Fastest Ruby Interpreter
- Easy to be the current Ruby interpreter
The Goal of YARV (cont.)
- Support all Ruby features
- Include Ruby 2.0 new syntaxes
- Native thread support
- Concurrent execution (Giant VM lock)
- Parallel execution on parallel machine
- Multi-VM instance
- Same as Multi-VM in Java
Goal: Ruby 2.0 syntax
- Matz will decide it :-)
- “{|…| …}” == “->(…){ … }”
- “I think this is ugly” — Ko1
- Multiple-values
- Same as Array? Or first class multiple-values support?
- Selector-namespace?
Goal: Native Thread Support
- Three different thread models
- Model 1: User-level thread (green)
- same as current Ruby interpreter
- Model 2: Native thread with giant VM lock
- Same as current Ruby interpreter
- Easy to implement
- Model 3: Native-thread with fine grain lock
- Run ruby threads in parallel
- For enterprise?
Goal: Native Thread Support (cont.)
Current Ruby Interpreter & Model 1
- CPU1: Thread 1 → Thread 2 → Thread 1
- CPU2: Idle……..
Model 2: Native thread with Giant VM Lock
- CPU1: Thread 1 → (Lock) → (in OS thread 2) Thread 2 → (Lock) → Thread 1
- CPU2: Idle……..
On this system, other threads can run (but the Ruby threads switch cpus with a lock)
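Model 2 in miniature, as a plain Ruby sketch. The `vm_lock` mutex is a hypothetical stand-in for the giant VM lock, and `run_with_giant_lock` is an illustrative name; on interpreters that already serialize threads the lock is redundant, but the shape is the same.

```ruby
# Each thread may be a native thread, but it has to hold the single
# lock to do any "interpreter work", so only one runs Ruby at a time.
def run_with_giant_lock(thread_count, steps_per_thread)
  vm_lock = Mutex.new   # stand-in for the giant VM lock
  trace = []            # records which thread ran each step

  threads = Array.new(thread_count) do |id|
    Thread.new do
      steps_per_thread.times do
        vm_lock.synchronize { trace << id }   # one step under the lock
      end
    end
  end

  threads.each(&:join)
  trace
end

trace = run_with_giant_lock(2, 3)
```

The order of entries in `trace` depends on scheduling; only the per-thread counts are deterministic.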
Model 3: Native thread with Fine Grain Lock
- CPU1: Thread 1……
- CPU2: Thread 2……
Goal: Native Thread Support Summary
 | Model 1 | Model 2 | Model 3 |
---|---|---|---|
Scalability | Bad | Bad? | Best |
Lock overhead | No | Some | High |
Impl. Difficulty | Norm | Easy | Hard |
Portability | Good | Bad | Bad |
Goal: Multi-VM Instance
- Current Ruby process: ( Process ( Ruby Interpreter (VM) ) )
- Ruby Process with Multi-VM Instance ( Process ((many) Ruby Interpreter (VM) ) )
- Current Ruby can hold only 1 interpreter in 1 process
- Interpreter structure causes this problem
- Using many global variables
- Multiple-VM instance
- Running some VM in 1 process
- It will help ruby embedded apps
- mod_ruby, etc.
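The global-variable problem can be pictured like this. A sketch only: `MiniVM` is a hypothetical class, standing in for the idea of moving interpreter-wide state into a per-VM structure.

```ruby
# State that would otherwise live in process-wide globals is kept per
# instance, so several "VMs" can coexist in one process.
class MiniVM
  def initialize
    @globals = {}   # per-VM table instead of C-level global variables
  end

  def set_global(name, value)
    @globals[name] = value
  end

  def get_global(name)
    @globals[name]
  end
end

vm1 = MiniVM.new
vm2 = MiniVM.new
vm1.set_global(:$app, "site A")
vm2.set_global(:$app, "site B")
```

With this shape, an embedding host such as a web server module could keep one VM per application without their globals colliding.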
Multi-VM Instance + Thread Model 2
CPU1: Thread 1 → (Lock of VM1) → Thread 2 → Lock of VM1
Goal: Load Map
- All Ruby features support
- Feb. 2006 … ?
- Native Thread Support
- Experimental: Dec. 2005
- Complete: 2006?(model 2) 2007?(model 3)
- Multi-VM support
- Experimental Feb 2006
- Complete: 2006?
Current Status of YARV
Status: System
Some almosts, an incomplete and a not yet
Status: Supported Ruby Features
- Almost all Ruby features
- Not supported:
- Few syntaxes …
{|*arg| ...}
- Visibility
- Safe level ($SAFE)
- Some methods written in C for current Ruby implementation
- Around Signal
- C extension libraries
- (Because YARV can’t run mkmf.rb)
Status: Versions
- 0.2 YARV as C Extension
- Need a patch to Ruby interpreter
- 0.3 (2005-8): YARV as Ruby Interpreter
- merged to Ruby source (1.9 HEAD)
- Maintained on my subversion repository
- Latest version: 0.3.2
- Native thread (pthread / Win32) supports model 2
YARV 0.2.x
(Ruby Interpreter (Evaluator)) → YARV (Compiler, VM, Optimizer) → back
YARV 0.3.x
- YARV merged with Ruby Interpreter
- Future work
- Generational GC
- m17n
- …
Status: Compile & Disasm CGI
Status: VM Design
- 5 registers
- PC: Program Counter
- SP: Stack Pointer
- CFP: Control Frame Pointer
- LFP: Local Frame Pointer
- DFP: Dynamic Frame Pointer
- Some stack frame
- Control stack and value stack
Status: Optimization
- Simple Stack Virtual Machine
- Re-design Exception handling
- Peep-hole optimization on compile time
- I gave up static program analysis
- Dynamicity is your friend but my ENEMY!
- Direct Threaded code with GCC
- Specialized Instruction
- i.e. Ruby program “x+y” compiled to special instruction instead of a method dispatch instruction
- In-line Cache
- In-line Method Cache
- In-line constant value cache
- Because ruby’s “constant variable” is not constant!
- Embed values in an instruction sequence
- Unified Instruction
- Operands Unification
- Insn_A x → Insn_A_x
- Unified instructions are auto generated by VM gen
- I only decide which instructions should be combined
- Stack Caching
- JIT Compilation
- I made easy one for x86, but…
- Too hard to do alone. I retired.
- AOT Compilation
- YARV bytecode → C Source
- Easy to develop
- Hard to support exception
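The in-line method cache idea, sketched in plain Ruby. This is an illustration of the general technique, not YARV's implementation: `CallSite` is a hypothetical class, and the sketch skips cache invalidation on method redefinition and ignores singleton methods.

```ruby
# One cache per call site: remember the receiver class seen last time
# and the method it resolved to, and skip lookup when the class repeats.
class CallSite
  attr_reader :hits, :misses

  def initialize(name)
    @name = name
    @cached_class = nil
    @cached_method = nil
    @hits = 0
    @misses = 0
  end

  def call(recv, *args)
    klass = recv.class
    if klass.equal?(@cached_class)
      @hits += 1                                      # fast path: no lookup
    else
      @misses += 1                                    # slow path: full lookup
      @cached_class = klass
      @cached_method = klass.instance_method(@name)
    end
    @cached_method.bind(recv).call(*args)
  end
end

site = CallSite.new(:+)
site.call(1, 2)       # miss: first time Integer is seen
site.call(3, 4)       # hit: same receiver class
site.call("a", "b")   # miss: receiver class changed to String
```

A real cache also has to be flushed when a method is redefined, which is the same reason the slides mention for caching "constants" that are not constant.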
Status: Demo
- YARV building demo?
- YARV running demo?
Status: Evaluation
- Yield block is not fast (2-3 times faster than C Ruby) – optimizing this will be work for the future
- With all optimizations, basic math can see a 50 times performance gain over C Ruby
- Ackermann can see 20 times gain over C Ruby
- Wow – YARV as it stands stacks up really well against other interpreters for basic math type stuff
Status: Awards
- 2004: Funded by IPA Exploratory Software Development “Youth”
- IPA: Information-technology Promotion Agency, Japan
- 2005: Funded by IPA Exploratory Software Development (continuance)
- 2004: got awarded “Super creator” from IPA
Conclusion
- YARV supports almost all Ruby syntax
- YARV supports some Ruby libraries
- YARV 0.3.2 supports native thread
- YARV achieves significant speedup for ruby programs which have VM bottleneck
- This means that we can enjoy Symbol programming with Ruby
Conclusion: Future Work
- Support all Ruby features
- mkmf.rb
- Support every thread model
- especially 2 and 3
- Support multi-VM Instance
How Can You Help me
- Any comments are welcome
- Build reports, Bug reports, architecture reports, …
- yarv-devel Mailing List
- English ML for YARV development
- Matz and other Japanese join
- YARV Wiki
- Give me a job! (I’ll finish my course 2 years later)
Special Thanks
- Matz the architect of Ruby
- IPA: His sponsor
- YARV development ML subs
- All rubyists
Q&A
- All: We want the demo!
- ko1: OK
- Derek Sivers: A bunch of Japanese
- ko1: Some more Japanese
MetaRuby: Reimplementing Ruby in Ruby
Eric Hodel
Once upon a time…
- Eric and Ryan were hacking some Ruby related C
- And it sucked
MetaRuby
- Will implement Ruby in Ruby
- Core libraries, parser, interpreter
MetaRuby Architecture
- Parser
- Interpreter
- GC
- …
Why?
- Writing Ruby internals in C requires a mental context switch every time you change between Ruby and C
Example of C code vs Ruby code.
- More Familiar
- More approachable
- Less to do
- No NULL termination
- No tainting or freezing
Inspirational Projects
- Squeak Smalltalk
- Self
- Pascal, Modula-2, Oberon by Wirth
- All of these are written in themselves
Related Projects
- Matju’s MetaRuby
- YARV
- JRuby
Matju’s MetaRuby
- Different goal, much more complex
- Abstracted core classes
YARV
- Ruby interpreter replacement
Rubidium
- Ruby interpreter replacement
- Rubidium is an optimizing Ruby interpreter
Rubytests
- Unit tests for Ruby
- Not comprehensive enough for our goals
- Not much work making it more complex
JRuby
- A 1.8.2 compatible Ruby interpreter
- Most builtin Ruby classes provided
- Support for interfacing and defining Java classes in Ruby
- Uses Rubytests
Current Work
Methodology
- Generate a stubbed class to overlay
- Drive unit tests to failure
- Identify core methods (primitives) that have to exist
- Fix bad tests that pass despite no implementation
- Drive all tests to green
- Hack, hack, hack
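The first two steps might look something like this in plain Ruby. A sketch under assumptions: `stubbed_overlay` is a hypothetical helper, not MetaRuby's actual generator, and it builds a separate class rather than overlaying the core one.

```ruby
# Generate a class with a stub for every public instance method the
# target class defines itself. Every stub fails loudly, so a test suite
# run against it starts out red and shows what is left to implement.
def stubbed_overlay(klass)
  Class.new do
    klass.instance_methods(false).each do |name|
      define_method(name) do |*args, &block|
        raise NotImplementedError, "#{klass}##{name} is not implemented yet"
      end
    end
  end
end

stub_range = stubbed_overlay(Range)
pending = stub_range.instance_methods(false)   # the to-do list
```

Replacing stubs one at a time with pure-Ruby bodies then drives the tests toward green; whatever cannot be written in Ruby is a candidate primitive.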
Passing Tests
- TrueClass, FalseClass and NilClass
- Time
- Range
- NilClass
- Array
- String
Overlaid Classes
These classes overlay their core classes using Ruby’s C allocation and initialization methods, replacing as many methods as possible
- TrueClass
- NilClass
- Array
- String
Replaced Classes
- Time
- Range
- Hash
Rubytests
- Stale
- Mostly tests Ruby 1.6 language features
- Low test coverage
- Not fully converted to Test::Unit
- Way too much code from pre-testunit
Test::Unit
- Needs lots of methods to work
- Too complicated to refactor
- Working on core classes is hard
Future Work
Primitives
- Will be automatically translated to C
- What is a primitive?
- Implement as much as possible in Ruby
- Whatever is left becomes a primitive
- Unless we can break it down
- Choosing primitives is a discovery process
Ruby2c Translation
- Ryan will cover this a lot more
- Only necessary for primitives
Memory Allocation (Objects)
- Currently Array and String sit on top of C Ruby
- Write object allocation in pure Ruby using current memory system for all objects
- Then we will replace the memory system with a pure Ruby system
Replace core ruby library
- Works!
- Well.. kind of..
- Compiles
- Links!
- Segfaults!
- Needs a lot of ping pong
Far Future Work
The Groveling Commences
Parser
- Ripper is our best target
- Almost entirely Ruby already
- Just one file is in C, which we can rewrite
Object System & Garbage Collector
- Steal ideas from Squeak Smalltalk, Self, current Ruby
- In theory it should be easy to do
- In reality it will be hard to do well
- We’d love someone to work on this
Interpreter
- YARV or eval.c (Ruby 1.8)?
- Rubidium?
- Needs to be written in Ruby
- We’d love someone to work on this
C Extensions & C Standard Library
- Why are you writing pure C anyways?
- Use RubyInline or DL
- Probably need Ruby/C compatibility stubs
- Easy to generate
- Will need to follow current Ruby/C naming conventions
Array#fill
Eight ways to call
- array.fill(obj)
- array.fill(obj, start[, length])
- array.fill(obj, range)
- array.fill {|index| block }
- array.fill(start…
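For reference, several of the call forms as core Ruby's Array#fill behaves, collected in one place. The `fill_examples` wrapper and its keys are only for illustration; any reimplementation has to dispatch on argument count, argument type, and block presence to cover all of them.

```ruby
# Each entry starts from a fresh [1, 2, 3, 4] because fill mutates.
def fill_examples
  {
    object:        [1, 2, 3, 4].fill(0),                   # every element
    object_start:  [1, 2, 3, 4].fill(0, 2),                # index 2 to the end
    object_length: [1, 2, 3, 4].fill(0, 1, 2),             # 2 elements from index 1
    object_range:  [1, 2, 3, 4].fill(0, 1..2),             # indexes 1 through 2
    block:         [1, 2, 3, 4].fill { |i| i * i },        # value computed from index
    block_start:   [1, 2, 3, 4].fill(2) { |i| i * 10 },    # block, from index 2
    block_range:   [1, 2, 3, 4].fill(1..2) { |i| i * 10 }  # block, over a range
  }
end
```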
“foo”.sub(/f(o)o/) { $1 }
- $1 is a “magick” read only global
- $1 can’t be set from pure Ruby
- So the interpreter needs to help us out
- Applies to all match variables
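The problem shows up as soon as `sub` is written in Ruby. In the sketch below, `pure_ruby_sub` is a hypothetical reimplementation (first match only, block form only); the match variables it sets are local to its own method frame, so the caller's block never sees them.

```ruby
# Built-in sub: the interpreter sets $~ (and so $1) in the caller's frame.
def builtin_sub_result
  "foo".sub(/f(o)o/) { $1 }
end

# Hypothetical pure-Ruby sub. Regexp#match sets $~ only in THIS frame.
def pure_ruby_sub(str, pattern)
  match = pattern.match(str)
  return str.dup unless match
  match.pre_match + yield(match).to_s + match.post_match
end

# $1 is looked up in the frame where the block was written, where no
# match has happened, so it is nil and the replacement is empty.
def pure_sub_with_dollar
  pure_ruby_sub("foo", /f(o)o/) { $1 }
end

# Workaround available from pure Ruby: hand the MatchData to the block.
def pure_sub_with_matchdata
  pure_ruby_sub("foo", /f(o)o/) { |m| m[1] }
end
```

Passing the MatchData works but changes the calling convention, which is why existing code that relies on `$1` needs support from the interpreter.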
String#split
- Easy
- “a b”.split # => [‘a’, ‘b’]
- “a|b”.split # => [
- “a1b”.split(/(\d)/) # => [‘a’,‘1’,‘b’]
- Hard
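A few behaviors of core Ruby's String#split that a pure-Ruby version has to reproduce. The notes do not record which cases the slide listed as hard, so these are just documented behaviors collected for illustration; `split_examples` is a hypothetical wrapper.

```ruby
def split_examples
  {
    default:   "a b".split,             # no argument: split on whitespace
    captured:  "a1b".split(/(\d)/),     # capture groups are kept in the result
    limited:   "a,b,c".split(",", 2),   # limit caps the number of fields
    awk_mode:  " a  b ".split(" "),     # single space: whitespace runs collapse
    regex_gap: " a  b ".split(/ /)      # regex: empty fields kept, trailing dropped
  }
end
```

The last two entries differ only in passing `" "` versus `/ /`, which is the kind of special-casing that makes a complete implementation fiddly.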
Time.rb Needs Metal
- Easy
- the_time.month
- the_time.to_f
- etc
- Hard
- Time.now requires calling libc’s gettime method
- Currently we have libcwrap.rb that uses RubyInline to call into C functions