Ye Olde
Talbott Blog

RubyConf 2005 - Afternoon Day One

JRuby: A Ruby VM in Java

Charles Nutter

Who am I?

  • headius@headius.com
  • Senior Architect/Technologist at Ventera Corp
  • Open Source
    • LiteStep
    • JRuby Developer

Part 1: Past and Present

What is JRuby?

  • A “100% Java”-based Ruby Interpreter
  • Mostly 1.8 compatible
  • Four years and 15 developers; currently 3-5 active developers, and under heavy development
  • Originally modeled in Ruby 1.6 code
  • Tri-licensed: CPL, GPL, LGPL
  • Sun J2SE 1.4 or higher (Free / OSS coming soon)
  • Java/Ruby integration getting better

Why JRuby?

  • JVM provides native threading, generational GC, and extensive networking and database support
    • Java is very fast these days and most code gets JIT’d
  • Wealth of libraries and frameworks; large userbase, wide deployment
  • Many Javaists would like to use Ruby more
  • Java is “just another platform” for Ruby
  • JRuby could help grow Ruby the language apart from C Ruby
    • could help define language apart from the platform it’s running on
  • Sun, others very interested in dynamic (typed) languages on the JVM
  • Javaists (by choice or by force) can help promote Ruby
  • Java/Ruby integration merges best of both
    • Use Java for its heavyweight advantages (library support, JMS, etc.), and Ruby for business logic
  • Ruby + J2EE = enterprise Ruby that managers can swallow
  • Ruby + J2ME…someday?

Demo 1: Java Integration

  • Ruby code mixed into Java class’s proxy
    • Enumerable, list behavior added to java.sql.ResultSet
  • JDBC used for DB access (PostgreSQL 8.0)
  • Mostly transparent object marshalling
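
The mixin idea from the demo can be sketched in plain Ruby. This is only an illustration, not the demo code: `FakeResultSet` is a hypothetical stand-in for the Java proxy, and its `next` and `get_string` methods are assumed names that loosely mirror a forward-only cursor.

```ruby
# Hypothetical stand-in for a java.sql.ResultSet proxy: a forward-only cursor
# with `next` (advance; true while a row exists) and `get_string`.
class FakeResultSet
  def initialize(rows)
    @rows = rows
    @pos = -1
  end

  def next
    @pos += 1
    @pos < @rows.length
  end

  def get_string(column)
    @rows[@pos].fetch(column)
  end
end

# Reopen the class and mix Ruby behavior into it, as the demo did for the proxy.
class FakeResultSet
  include Enumerable

  # Enumerable only needs `each`; here it yields the cursor once per row.
  def each
    yield self while self.next
  end
end

rows = FakeResultSet.new([{ "name" => "matz" }, { "name" => "ko1" }])
names = rows.map { |row| row.get_string("name") }
puts names.inspect
```

Once `each` exists, `map`, `select`, `to_a` and the rest of Enumerable come along for free, which is presumably what made the ResultSet feel list-like in the demo.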

Peers

  • Jython
    • Pythonists dig it
    • By far the most popular JVM dyn lang
    • Established, stable, feature-complete
    • Helping to formally define/distinguish Python the language from Python the interpreter
    • Interpreted or compiled (runs Python bytecode, or compiles to Java)
    • Widely used
  • Groovy
    • Ruby-like syntax, some features from Nice
    • Seamless Java integration
    • First dyn language JSR
    • Lots of momentum
    • Interpreted (JIT) or compiled offline to Java code
  • SISC
    • JRuby redesign follows similar patterns
  • Many others

The Real World

  • RDT: A Ruby IDE for Eclipse
  • jEdit: A Multi-language Code Editor
  • DataVision: Java-based Reporting Software
  • Internal projects
  • Need more

How “Ruby” is it?

  • Of 1049 Rubicon tests, 80% succeed
  • Temporary incompatibilities
    • Ruby thread semantics differ from Java’s
    • No continuations
    • Twice as slow (half as fast?) as C Ruby or worse
    • YAML: no up-to-date, working pure Ruby or Java parsers
    • Still missing a few 1.8 methods
  • Permanent incompatibilities
    • System calls, C-language Ruby extensions, anything to do with C
    • Platform-specifics: file stats, permissions, process launching, signals, …

Part Two: The Future

  • Continuing to improve compatibility
  • Running mainstream Ruby apps
  • Improving Java integration
  • Speeding up
  • The New JRuby

What needs to change?

  • JRuby deficiencies (as of 0.8.2)
    • Stack depth (~ fib(280))
    • Threading and thread semantics
    • Continuations support
    • Speed
    • Consistency, maintainability
    • Compilation
    • Better use of Java’s strengths
    • Tighter integration between Java and Ruby
  • Ruby deficiencies (as of 1.8):
    • Stack depth (~ fib(1325))
    • Native threading
    • Speed
    • Compilation

The New JRuby

  • Stackless; Continuation Passing Style (roughly)
  • Iterative interpreter
  • m:n threading model
  • Compilation to Java bytecodes, offline and JIT
  • Pluggable architecture
  • Seamless, powerful Ruby/Java integration
  • Behave in controlled environments
  • FAST
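
To make "stackless" and "continuation passing style" a little more concrete, here is a minimal sketch in plain Ruby. It is not JRuby's design: `trampoline` and `sum_cps` are hypothetical names, and the technique shown is an ordinary trampoline, where each step returns a thunk for the next step instead of calling it directly.

```ruby
# Run steps in a loop: while a step is a Proc (a thunk), call it to get the
# next step. The host stack stays flat no matter how deep the logical recursion.
def trampoline(step)
  step = step.call while step.is_a?(Proc)
  step
end

# Sum 1..n in continuation-passing style. `k` is the continuation, i.e. "what
# to do with the result". Every call returns a thunk rather than recursing.
def sum_cps(n, k)
  return -> { k.call(0) } if n.zero?

  -> { sum_cps(n - 1, ->(acc) { -> { k.call(acc + n) } }) }
end

# The logical recursion is 100,000 levels deep, far more than a naive
# recursive method is normally allowed.
result = trampoline(sum_cps(100_000, ->(value) { value }))
puts result
```

The pending work lives in heap-allocated closures rather than in native stack frames, which is the property that lets an iterative interpreter recurse as deeply as memory allows.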

Milestones and progress

  • Stackless, iterative proof of concept (POC) (Sept 15, complete)
  • Redesign, refactoring of POC (Oct)
  • Reimpl of interpreter based on POC (Nov)
  • Reimpl of built-in classes (Nov-Jan)
  • Threading engine (Jan)
  • Tail-call optimizations (Jan)
  • Continuations (Jan)
  • Compilation (Feb – Apr)
  • Complete for JavaOne 2006

Demo 2: Fibonacci

  • Recursive fib algorithm (contrived, I know)
  • JRuby 0.8.2: shallow
  • Ruby: deeper
  • JRuby “stackless” POC: deepest

(Demo of doing fib 30000 in JRuby! Pretty cool.)
(Somewhat longer demo of 150000. Also cool.)

What Else?

  • YARV bytecode execution
  • MetaRuby’s “Ruby in Ruby” useful to JRuby
  • drb proxy to RMI
  • ActiveRecord JDBC connector
  • WEBrick-mimicking servlets
  • Other ideas?

Part Three: What now?

  • Redesign is in full swing
  • Heavy refactoring of JRuby core
  • A better Ruby than ruby?
  • Help Wanted!
    • zlib implementation using Ruby-Java integration
    • File locking using Java’s NIO (New I/O)
    • Feature-complete YAML support
    • Running mainstream Ruby apps, isolating and reporting errors
    • Help with new design and with refactoring effort
    • Tangibles

Q&A

  • JRuby
  • JRuby mailing lists on SF
  • Charles Oliver Nutter: headius@headius.com
  • Thanks to:
    • Thomas Enebo: JRuby project manager
    • Kelly Nawrocke: JRuby developer
    • David Corbin: JRuby developer, RDT developer
    • Special thanks to Jan Arne Petersen, original JRubyist

Questions

  • ???: About YAML – parser written in C, have C to Java translators been tried?
    • Charles: Might not produce code that would wire in nicely; focusing on a pure Ruby implementation.
  • David Black: What about things that would be nicer if they differed from how they currently work in C Ruby – for instance, similar behavior often goes through different code paths? Can you change those things? Will that make it less Ruby?
    • Charles: Mainly taken perspective that we are following what Ruby does and following what Matz and company do. Having this other platform will point out inconsistencies; some things are unfollowable. Having two places where behavior is implemented shows inconsistencies.
  • ???: Is JRuby going to be reentrant? Will you be able to run multiple JRuby instances in the same process?
    • Charles: Yes, unable to control where calls are coming from, so needs to be re-entrant. Either that or able to run multiple lightweight interpreters in the same VM and then manage state. Not thread safe at this point but hopefully that will change.
  • Duane Johnson: In the demo, the each iterator isn’t acting very Ruby-like.
    • Charles: The demo is kind of put together to show everything. What would probably be better would be a Ruby-Java layer that does “rubyfication.” Code as demo’d was more javaish but still simpler than real Java.

YARV Progress Report

Koichi SASADA

Caution! (review)

  • I can’t speak English well
    • If I say strange English, you can see the slide page
      • Or ask another Japanese. They can speak English well.
    • If you have any questions, ask me with:
      • Japanese (recommended)
      • Ruby, C, Scheme, Java, …
      • IRC (#rubyconf on freenode)

Agenda

  • Self Introduction and Japanese Activities
  • Overview of YARV
  • Goal of YARV
  • Current YARV status
    • YARV Design, Optimization Review
    • Evaluation
  • Conclusion

Self Introduction

  • “SASADA” is the family name
  • “Koichi” is given name → “ko1”
  • A Student for Ph.D. 2nd grade (Not a Son-shi)
    • Systems Software for Multithreaded Arch
      • SMT/CMP or other tech
      • i.e.: Hyper-Threading (Intel), CMT (Sun), Power (IBM)
      • OS, Library, Compiler and Interpreter
      • YARV is my first step for parallel interpreter
    • Computer Architecture for Next Generation at Public Position
  • Nihon Ruby no Kai
    • Organized by Mr. Takahashi (maki)
  • Rubyist Magazine
    • vol 10 at 10 Oct 2005
    • 1st anniversary at 6 Sep 2005 (vol 9)
  • Ruby-dev summary
  • English Diary some days
    • But retired

Our Activity: Rubyist Magazine

  • Many Japanese articles related to Ruby
    • Cooperate with Ruby Code & Style?
    • I’m writing YARV internal named “YARV Maniacs”
  • Many interviews of Japanese Rubyists

RubyMa!

  • Published 1 Apr 2005 (April Fools)
    • Joke web-zine
  • Parody of Negima!
  • Many useful articles
  • The Takahashi method:
    def Takahashi
    end

Overview of YARV

Overview: Background

  • Ruby is used world-wide, one of the most comfortable programming languages
  • Ruby is slow, because interpreter doesn’t use Virtual Machine Technology
  • We need Ruby VM!
  • YARV: Yet Another Ruby VM
    • Started development on 1 Jan 2004
      • At that time, there were some VMs for Ruby
  • Ruby’s license, of course

Overview: FAQ (review of last year FAQ)

  • Q: How is “YARV” pronounced?
  • A: You can pronounce “YARV” what you like.
  • Q: Should I remember the name “YARV”?
  • A: No. If YARV succeeds, it gets renamed to Rite, if it doesn’t, no one will remember it
    • About YARV, name is NOT ???

Overview: YARV System

Overview: Current Interpreter

  • Ruby Program: a = b + c
  • Syntax tree: (a =) → (method dispatch + (b), (c))
  • Current interpreter traverses AST directly

Overview YARV – Stack Machine
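
The difference between the two approaches can be sketched for `a = b + c`. This is a toy, not YARV: the node shapes and the instruction names (`getlocal`, `send`, `setlocal`) are assumptions made for the example.

```ruby
# --- AST walking: recurse over the tree every time the code runs ---
AST = [:assign, :a, [:call, :+, [:lvar, :b], [:lvar, :c]]].freeze

def eval_ast(node, env)
  case node[0]
  when :assign then env[node[1]] = eval_ast(node[2], env)
  when :call   then eval_ast(node[2], env).send(node[1], eval_ast(node[3], env))
  when :lvar   then env.fetch(node[1])
  end
end

# --- Stack machine: the tree is compiled once into a flat instruction list ---
INSNS = [
  [:getlocal, :b],   # push b
  [:getlocal, :c],   # push c
  [:send, :+, 1],    # pop 1 argument and the receiver, push receiver + argument
  [:setlocal, :a]    # pop the result into a
].freeze

def run(insns, env)
  stack = []
  insns.each do |op, *operands|
    case op
    when :getlocal
      stack.push(env.fetch(operands[0]))
    when :send
      name, argc = operands
      args = stack.pop(argc)
      receiver = stack.pop
      stack.push(receiver.send(name, *args))
    when :setlocal
      env[operands[0]] = stack.pop
    end
  end
  env
end

tree_env = { b: 2, c: 3 }
eval_ast(AST, tree_env)
vm_env = run(INSNS, { b: 2, c: 3 })
puts tree_env[:a], vm_env[:a]
```

Both paths compute the same value; the stack machine replaces recursive tree traversal with a simple loop over instructions, which is the part a VM can then optimize.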

The Goal of YARV

  • YARV: Yet Another RubyVM → The RubyVM
    • To be the Ruby 2.0 VM Rite
  • Fastest Ruby Interpreter
    • Easy to be the current Ruby interpreter

The Goal of YARV (cont.)

  • Support all Ruby features
    • Include Ruby 2.0 new syntaxes
  • Native thread support
    • Concurrent execution (Giant VM lock)
    • Parallel execution on parallel machine
  • Multi-VM instance
    • Same as Multi-VM in Java

Goal: Ruby 2.0 syntax

  • Matz will decide it :-)
  • “{|…| …}” == “->(…){ … }”
    • “I think this is ugly” — Ko1
  • Multiple-values
    • Same as Array? Or first class multiple-values support?
  • Selector-namespace?
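
For readers who have not seen the proposed arrow form, here is a small sketch of the equivalence on the slide, assuming a Ruby version that accepts the `->` literal (it did later ship in Ruby 1.9). The variable names are only for illustration.

```ruby
# Block-style lambda, as written in Ruby 1.8
add_block = lambda { |a, b| a + b }

# Arrow-style lambda literal, the form shown on the slide
add_arrow = ->(a, b) { a + b }

puts add_block.call(1, 2)
puts add_arrow.call(1, 2)
```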

Goal: Native Thread Support

  • Three different thread models
  • Model 1: User-level thread (green)
    • same as current Ruby interpreter
  • Model 2: Native thread with giant VM lock
    • Same as current Ruby interpreter
    • Easy to implement
  • Model 3: Native-thread with fine grain lock
    • Run ruby threads in parallel
    • For enterprise?

Goal: Native Thread Support (cont.)

Current Ruby Interpreter & Model 1
  • CPU1: Thread 1 → Thread 2 → Thread 1
  • CPU2: Idle……..
Model 2: Native thread with Giant VM Lock
  • CPU1: Thread 1 → (Lock) → (in OS thread 2) Thread 2 → (Lock) → Thread 1
  • CPU2: Idle……..

On this system, other threads can run (but the Ruby threads switch cpus with a lock)

Model 3: Native thread with Fine Grain Lock
  • CPU1: Thread 1……
  • CPU2: Thread 2……
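
Model 2 can be imitated in plain Ruby to show what the giant lock buys and costs. This is a simulation under stated assumptions, not YARV code: `GIANT_LOCK` is a hypothetical name, and an ordinary `Mutex` stands in for the VM-wide lock.

```ruby
# One lock for the whole "VM": a thread must hold it to touch interpreter state.
GIANT_LOCK = Mutex.new

counter = 0

threads = 4.times.map do
  Thread.new do
    1000.times do
      # Only one thread at a time gets past this line, so shared state stays
      # consistent, but the threads never do this work in parallel.
      GIANT_LOCK.synchronize { counter += 1 }
    end
  end
end

threads.each(&:join)
puts counter
```

Model 3 would replace the single lock with many small ones around individual structures, which allows real parallelism at the price of more locking overhead and a harder implementation.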

Goal: Native Thread Support Summary

                  Model 1  Model 2  Model 3
Scalability       Bad      Bad?     Best
Lock overhead     No       Some     High
Impl. Difficulty  Norm     Easy     Hard
Portability       Good     Bad      Bad

Goal: Multi-VM Instance

  • Current Ruby process: ( Process ( Ruby Interpreter (VM) ) )
  • Ruby Process with Multi-VM Instance ( Process ((many) Ruby Interpreter (VM) ) )
  • Current Ruby can hold only 1 interpreter in 1 process
    • Interpreter structure causes this problem
    • Using many global variables
  • Multiple-VM instance
    • Running some VM in 1 process
    • It will help ruby embedded apps
      • mod_ruby, etc.
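
Why global variables get in the way can be shown with a toy. `TinyVM` is a hypothetical class, not part of YARV: the point is only that once interpreter state lives in an instance instead of in process-wide globals, several interpreters can share a process without stepping on each other.

```ruby
# A miniature "interpreter" whose state (here, just a globals table) belongs
# to the instance rather than to the process.
class TinyVM
  def initialize
    @globals = {}
  end

  def set_global(name, value)
    @globals[name] = value
  end

  def get_global(name)
    @globals[name]
  end
end

vm1 = TinyVM.new
vm2 = TinyVM.new
vm1.set_global("$app", "first")
vm2.set_global("$app", "second")

puts vm1.get_global("$app")
puts vm2.get_global("$app")
```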

Multi-VM Instance + Thread Model 2

CPU1: Thread 1 → (Lock of VM1) → Thread 2 → Lock of VM1

Goal: Load Map

  • All Ruby features support
    • Feb. 2006 … ?
  • Native Thread Support
    • Experimental: Dec. 2005
    • Complete: 2006?(model 2) 2007?(model 3)
  • Multi-VM support
    • Experimental Feb 2006
    • Complete: 2006?

Current Status of YARV

Status: System

Some almosts, an incomplete and a not yet

Status: Supported Ruby Features

  • Almost all Ruby features
  • Not supported:
    • Few syntaxes … {|*arg| ...}
    • Visibility
    • Safe level ($SAFE)
    • Some methods written in C for current Ruby implementation
    • Around Signal
    • C extension libraries
      • (Because YARV can’t run mkmf.rb)

Status: Versions

  • 0.2 YARV as C Extension
    • Need a patch to Ruby interpreter
  • 0.3 (2005-8): YARV as Ruby Interpreter
    • merged to Ruby source (1.9 HEAD)
    • Maintained on my subversion repository
  • Latest version: 0.3.2
    • Native thread (pthread / Win32) supports model 2

YARV 0.2.x

(Ruby Interpreter (Evaluator)) → YARV (Compiler, VM, Optimizer) → back

YARV 0.3.x

  • YARV merged with Ruby Interpreter
  • Future work
    • Generational GC
    • m17n

Status: Compile & Disasm CGI

Status: VM Design

  • 5 registers
    • PC: Program Counter
    • SP: Stack Pointer
    • CFP: Control Frame Pointer
    • LFP: Local Frame Pointer
    • DFP: Dynamic Frame Pointer
  • Some stack frame
  • Control stack and value stack

Status: Optimization

  • Simple Stack Virtual Machine
    • Re-design Exception handling
  • Peep-hole optimization on compile time
    • I gave up static program analysis
    • Dynamicity is your friend but my ENEMY!
  • Direct Threaded code with GCC
  • Specialized Instruction
    • i.e. Ruby program “x+y” compiled to special instruction instead of a method dispatch instruction
  • In-line Cache
    • In-line Method Cache
    • In-line constant value cache
      • Because ruby’s “constant variable” is not constant!
  • Embed values in an instruction sequence
  • Unified Instruction
    • Operands Unification
    • Insn_A x → Insn_A_x
  • Unified instructions are auto generated by VM gen
    • I only decide which instructions should be combined
  • Stack Caching
  • JIT Compilation
    • I made easy one for x86, but…
    • Too hard to do alone. I retired.
  • AOT Compilation
    • YARV bytecode → C Source
    • Easy to develop
    • Hard to support exception

Status: Demo

  • YARV building demo?
  • YARV running demo?

Status: Evaluation

  • Yielding to a block is not fast yet (only 2-3 times faster than C Ruby) – optimizing this will be work for the future
  • With all optimizations, basic math can see a 50 times performance gain over C Ruby
  • Ackermann can see 20 times gain over C Ruby
  • Wow – YARV as it stands stacks up really well against other interpreters for basic math type stuff

Status: Awards

  • 2004: Funded by IPA Exploratory Software Development “Youth”
    • IPA: Information-technology Promotion Agency, Japan
  • 2005: Funded by IPA Exploratory Software Development (continuance)
  • 2004: got awarded “Super creator” from IPA

Conclusion

  • YARV supports almost all Ruby syntax
  • YARV supports some Ruby libraries
  • YARV 0.3.2 supports native thread
  • YARV achieves significant speedup for ruby programs which have VM bottleneck
    • This means that we can enjoy Symbol programming with Ruby

Conclusion: Future Work

  • Support all Ruby features
    • mkmf.rb
  • Support every thread model
    • especially 2 and 3
  • Support multi-VM Instance

How Can You Help me

  • Any comments are welcome
    • Build reports, Bug reports, architecture reports, …
  • yarv-devel Mailing List
    • English ML for YARV development
      • Matz and other Japanese join
  • YARV Wiki
  • Give me a job! (I’ll finish my course 2 years later)

Special Thanks

  • Matz the architect of Ruby
  • IPA: His sponsor
  • YARV development ML subs
  • All rubyists

Q&A

  • All: We want the demo!
    • ko1: OK
  • Derek Sivers: A bunch of Japanese
    • ko1: Some more Japanese

MetaRuby: Reimplementing Ruby in Ruby

Eric Hodel

Once upon a time…

  • Eric and Ryan were hacking some Ruby related C
  • And it sucked

MetaRuby

  • Will implement Ruby in Ruby
    • Core libraries, parser, interpreter

MetaRuby Architecture

  • Parser
  • Interpreter
  • GC

Why?

  • Writing Ruby internals in C requires a mental context switch every time you change between Ruby and C

Example of C code vs Ruby code.

  • More Familiar
  • More approachable
  • Less to do
    • No NULL termination
    • No tainting or freezing

Inspirational Projects

  • Squeak Smalltalk
  • Self
  • Pascal, Modula-2, Oberon by Wirth
  • All of these are written in themselves

Related Projects

  • Matju’s MetaRuby
  • YARV
  • JRuby

Matju’s MetaRuby

  • Different goal, much more complex
  • Abstracted core classes

YARV

  • Ruby interpreter replacement

Rubidium

  • Ruby interpreter replacement
  • Rubidium is an optimizing Ruby interpreter

Rubytests

  • Unit tests for Ruby
  • Not comprehensive enough for our goals
  • Not much work making it more complex

JRuby

  • A 1.8.2 compatible Ruby interpreter
  • Most builtin Ruby classes provided
  • Support for interfacing and defining Java classes in Ruby
  • Uses Rubytests

Current Work

Methodology

  • Generate a stubbed class to overlay
  • Drive unit tests to failure
    • Identify core methods (primitives) that have to exist
    • Fix bad tests that pass despite no implementation
  • Drive all tests to green
    • Hack, hack, hack

Passing Tests

  • TrueClass, FalseClass and NilClass
  • Time
  • Range
  • NilClass
  • Array
  • String

Overlaid Classes

These classes overlay their core classes using Ruby’s C allocation and initialization methods replacing as many methods as possible

  • TrueClass
  • NilClass
  • Array
  • String

Replaced Classes

  • Time
  • Range
  • Hash

Rubytests

  • Stale
    • Mostly tests Ruby 1.6 language features
  • Low test coverage
  • Not fully converted to Test::Unit
    • Way too much code from pre-testunit

Test::Unit

  • Needs lots of methods to work
  • Too complicated to refactor
  • Working on core classes is hard

Future Work

Primitives

  • Will be automatically translated to C
  • What is a primitive?
    • Implement as much as possible in Ruby
    • Whatever is left becomes a primitive
      • Unless we can break it down
  • Choosing primitives is a discovery process
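
The primitive/non-primitive split can be illustrated with a toy container. `MiniArray` is hypothetical and not MetaRuby code: `size` and `at` play the role of primitives (here they simply lean on a built-in Array), and everything else is written in Ruby on top of them.

```ruby
class MiniArray
  def initialize(*items)
    @store = items
  end

  # --- "Primitives": the small core that would need a lower-level backing ---
  def size
    @store.length
  end

  def at(index)
    @store[index]
  end

  # --- Pure Ruby, built only on the primitives above ---
  def each
    i = 0
    while i < size
      yield at(i)
      i += 1
    end
    self
  end

  def include?(obj)
    each { |item| return true if item == obj }
    false
  end

  def map
    result = []
    each { |item| result << yield(item) }
    result
  end
end

list = MiniArray.new(1, 2, 3)
doubled = list.map { |x| x * 2 }
puts doubled.inspect
puts list.include?(2)
```

Trying to write `size` or `at` without the backing Array is where the discovery process starts: whatever cannot be expressed in terms of something smaller ends up on the primitives list.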

Ruby2c Translation

  • Ryan will cover this a lot more
  • Only necessary for primitives

Memory Allocation (Objects)

  • Currently Array and String sit on top of C Ruby
  • Write object allocation in pure Ruby using current memory system for all objects
  • Then we will replace the memory system with a pure Ruby system

Replace core ruby library

  • Works!
  • Well.. kind of..
  • Compiles
  • Links!
  • Segfaults!
  • Needs a lot of ping pong

Far Future Work

The Groveling Commences

Parser

  • Ripper is our best target
  • Almost entirely Ruby already
  • Just one file is in C, which we can rewrite

Object System & Garbage Collector

  • Steal ideas from Squeak Smalltalk, Self, current Ruby
  • In theory it should be easy to do
  • In reality it will be hard to do well
  • We’d love someone to work on this

Interpreter

  • YARV or eval.c (Ruby 1.8)?
  • Rubidium?
  • Needs to be written in Ruby
  • We’d love someone to work on this

C Extensions & C Standard Library

  • Why are you writing pure C anyways?
    • Use RubyInline or DL
  • Probably need Ruby/C compatibility stubs
    • Easy to generate
  • Will need to follow current Ruby/C naming conventions

Array#fill

Eight ways to call

  • array.fill(obj)
  • array.fill(obj, start[, length])
  • array.fill(obj, range)
  • array.fill {|index| block }
  • array.fill(start…
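
To see why one method with this many call forms is painful to reimplement, here are the forms that Ruby's built-in `Array#fill` accepts, run against the same four-element array. The example values are arbitrary.

```ruby
base = [1, 2, 3, 4]

filled = [
  base.dup.fill(0),                      # whole array with an object
  base.dup.fill(0, 2),                   # from a start index to the end
  base.dup.fill(0, 1, 2),                # start index plus length
  base.dup.fill(0, 1..2),                # a range of indexes
  base.dup.fill { |i| i * i },           # block computes each value
  base.dup.fill(2) { |i| i * 10 },       # block, from a start index
  base.dup.fill(1, 2) { |i| -i },        # block, start index plus length
  base.dup.fill(0..1) { |i| i + 100 }    # block, over a range
]

filled.each { |result| puts result.inspect }
```

A pure-Ruby version has to sort out which of these shapes it was called with from the argument count, the argument types, and whether a block was given.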

“foo”.sub(/f(o)o/) { $1 }

  • $1 is a “magick” read only global
  • $1 can’t be set from pure Ruby
  • So the interpreter needs to help us out
  • Applies to all match variables
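
The problem shows up as soon as `sub` is written in plain Ruby. In this sketch `my_sub` is a hypothetical, simplified replacement (first match only): the match happens inside `my_sub`, so the match variables are set there, and the caller's block does not see them.

```ruby
# Simplified pure-Ruby sub: replaces the first match with the block's result.
# The MatchData is also passed to the block as a workaround.
def my_sub(str, pattern)
  match = pattern.match(str)
  return str.dup unless match

  match.pre_match + yield(match).to_s + match.post_match
end

# Built-in sub: the interpreter sets $1 where the block can see it.
def with_builtin
  "foo".sub(/f(o)o/) { $1 }
end

# Pure-Ruby sub: the block's $1 was never set by the match inside my_sub.
def with_pure_ruby
  my_sub("foo", /f(o)o/) { $1 }
end

# Workaround available today: use the MatchData handed to the block.
def with_explicit_match
  my_sub("foo", /f(o)o/) { |m| m[1] }
end

puts with_builtin.inspect
puts with_pure_ruby.inspect
puts with_explicit_match.inspect
```

Each call is wrapped in its own method so that a match made by one does not leak into the next. Making the middle case behave like the first is the help the interpreter has to provide.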

String#split

  • Easy
  • “a b”.split # => [‘a’, ‘b’]
  • “a|b”.split # => [
  • “a1b”.split(/(\d)/) # => [‘a’,‘1’,‘b’]
  • Hard
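
A few more behaviors of the built-in `String#split` hint at where the hard part comes from. These examples are not from the slides; they only show corner cases any reimplementation has to match.

```ruby
results = {
  whitespace: " a  b ".split,             # no argument: split on runs of whitespace
  trailing:   "a,b,,".split(","),         # trailing empty fields are dropped...
  keep_empty: "a,b,,".split(",", -1),     # ...unless a negative limit is given
  limit:      "a,b,c".split(",", 2),      # limit caps the number of fields
  chars:      "abc".split(""),            # empty separator splits into characters
  captures:   "a1b2c".split(/(\d)/)       # capture groups are kept in the result
}

results.each { |name, value| puts "#{name}: #{value.inspect}" }
```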

Time.rb Needs Metal

  • Easy
    • the_time.month
    • the_time.to_f
    • etc
  • Hard
    • Time.now requires calling libc’s gettime method
    • Currently we have libcwrap.rb that uses RubyInline to call into C functions
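
The easy/hard split can be sketched as follows. This is not the MetaRuby Time.rb: `civil_from_epoch` is a hypothetical helper using a well-known days-to-calendar-date computation, and the built-in `Time.now.to_i` stands in for the one piece that needs "metal", the current time from the operating system.

```ruby
# Pure Ruby: turn seconds since the Unix epoch into [year, month, day] in UTC.
# Relies on Ruby's integer division rounding toward negative infinity.
def civil_from_epoch(secs)
  days = secs / 86_400
  z = days + 719_468                       # shift epoch to 0000-03-01
  era = z / 146_097                        # 400-year cycles
  doe = z - era * 146_097                  # day of era
  yoe = (doe - doe / 1460 + doe / 36_524 - doe / 146_096) / 365
  doy = doe - (365 * yoe + yoe / 4 - yoe / 100)   # day of year, March-based
  mp = (5 * doy + 2) / 153                 # month index, March = 0
  day = doy - (153 * mp + 2) / 5 + 1
  month = mp < 10 ? mp + 3 : mp - 9
  year = yoe + era * 400 + (month <= 2 ? 1 : 0)
  [year, month, day]
end

# The "hard" part: asking the OS for the current time. Here the built-in call
# stands in for whatever primitive ends up wrapping libc.
now_secs = Time.now.to_i

puts civil_from_epoch(now_secs).inspect
puts civil_from_epoch(0).inspect
```

Everything after the single call that fetches the seconds is ordinary arithmetic, which is why methods like `month` are easy to move into Ruby while `Time.now` is not.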